<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Asimov Press: Data Briefs]]></title><description><![CDATA[Data snapshots on important facets of biological progress.]]></description><link>https://www.asimov.press/s/data-briefs</link><image><url>https://substackcdn.com/image/fetch/$s_!IQZz!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f45ea53-c2aa-4b05-bce8-6b022f8a0929_256x256.png</url><title>Asimov Press: Data Briefs</title><link>https://www.asimov.press/s/data-briefs</link></image><generator>Substack</generator><lastBuildDate>Sun, 03 May 2026 01:16:28 GMT</lastBuildDate><atom:link href="https://www.asimov.press/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Asimov Press]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[niko@asimov.com]]></webMaster><itunes:owner><itunes:email><![CDATA[niko@asimov.com]]></itunes:email><itunes:name><![CDATA[Asimov Press]]></itunes:name></itunes:owner><itunes:author><![CDATA[Asimov Press]]></itunes:author><googleplay:owner><![CDATA[niko@asimov.com]]></googleplay:owner><googleplay:email><![CDATA[niko@asimov.com]]></googleplay:email><googleplay:author><![CDATA[Asimov Press]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The Price of E. Coli]]></title><description><![CDATA[Bioengineers commonly view microbes as reprogrammable &#8220;cellular factories&#8221; for manufacturing high-value molecules. 
But what are we throwing away?]]></description><link>https://www.asimov.press/p/price-of-ecoli</link><guid isPermaLink="false">https://www.asimov.press/p/price-of-ecoli</guid><dc:creator><![CDATA[Asimov Press]]></dc:creator><pubDate>Mon, 20 Oct 2025 19:16:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BtY4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BtY4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BtY4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 424w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 848w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!BtY4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg" width="1456" height="917" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2820464,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/173278982?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BtY4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 424w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 848w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!BtY4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3b95560-b1ec-4f9d-b706-3560470b695a_2000x1260.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>By Sam Clamons</strong></p><p>Metabolic engineering is the science (and art) of engineering living cells, usually bacteria or single-celled yeasts, to produce valuable molecules that can then be extracted, purified, and sold. 
Engineered microbes are already used to make familiar compounds like ethanol and acetone, and also more exotic molecules like 3&#8208;hydroxypropionic acid and squalene.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> The list of chemicals produced in research labs at small scales is <a href="https://mcf.computbiol.com/statistics.html">much longer</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> The market for microbial products is currently estimated to be around <a href="https://www.precedenceresearch.com/microbial-products-market">$200 billion per year</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>But forget, for just a moment, about engineering a microbe to produce a <em>new </em>compound. Consider, instead, a (hypothetical) future in which it is possible to isolate and sell all the molecules that <em>E. coli </em>already produces naturally. Imagine if there were a technology that made it simple &#8212; and inexpensive &#8212; to pulverize a bacterium, smear it out, collect each molecule in its own tube,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> and sell these vials. How much would <em>E. coli </em>be worth?</p><p>The answer, according to my own estimates, is that the raw metabolites and macromolecules isolated from one liter of <em>E. coli </em>cells would be worth more than $600,000. This is more of a thought experiment than a serious economic analysis, but it suggests that we are collectively undervaluing the power and &#8220;technological&#8221; sophistication of even the smallest and simplest organisms.</p><p>To be clear, &#8220;fractionate <em>E. 
coli</em> and sell off its parts&#8221; is not a serious business proposal. Extracting even a single molecule from a microbial broth at high purity can be difficult, finicky, and expensive. And extracting <em>all</em> of the molecules, individually and at a high degree of purity, far exceeds today&#8217;s technical expertise. Nevertheless, quantifying the &#8220;parts price&#8221; of a microbe is a valuable way to understand the power and complexity of the &#8220;cellular factories&#8221; we use to make our medicines, fuels, and foods.</p><h2>The Method</h2><p>Bacteria are neither expensive nor difficult to grow. A 1-liter culture of <em>E. coli</em> bacteria can be made in a single night by mixing 25 g of Luria broth base ($2-6) with 1 liter of water, preferably deionized (salt-free). This liquid broth is put in an autoclave, or high-temperature pressure-cooker, and then transferred to a <a href="https://bellcoglass.com/product/3-x-dp-baff-shake-flask2000ml-38mm-delong-neck/">shake flask.</a> A small amount of <em>E. 
coli </em>is dropped into the sterile liquid, and then the whole thing is placed in a shaking incubator at 37&#176; C.</p><p>After 8 hours or so, the culture is &#8220;spun down&#8221; in a centrifuge to yield around one gram (about half a thumb&#8217;s worth) of waxy sludge. That&#8217;s the bacteria! The equipment required for all this is inexpensive and readily available. The whole procedure requires perhaps an hour of labor ($10-30).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xZSm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xZSm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 424w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 848w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 1272w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xZSm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png" width="640" height="853" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:853,&quot;width&quot;:640,&quot;resizeWidth&quot;:640,&quot;bytes&quot;:664929,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/173278982?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xZSm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 424w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 848w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 1272w, https://substackcdn.com/image/fetch/$s_!xZSm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55719d47-3d93-4a7d-bed8-f3f098304a66_640x853.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A pellet of <em>E. coli </em>cells in a 1-liter vessel. Credit: Reddit <a href="https://www.reddit.com/r/labrats/comments/jfx7nl/damn_this_pellet_is_thick/">u/nikr_ecoli</a>.</figcaption></figure></div><p>The actual mass of bacteria yielded from a saturated overnight culture varies. Biologists typically gauge culture growth using &#8220;optical density&#8221; (OD), which measures how much light of a defined wavelength (typically 600 nm) is scattered when passing through the media.<em> E. 
coli</em> cells in LB media start to slow their growth somewhere between 0.5 and 1 OD. For the sake of this exercise, I&#8217;ve assumed a harvest density of exactly 1 OD, which, for a liter of media, is about one trillion <em>E. coli </em>cells<em>.</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> One trillion cells is roughly equivalent to a 1 g pellet, representing <a href="https://doi.org/10.1371/journal.pone.0023126">300 mg of dry weight</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p><p>With that baseline in hand, the rest of the calculation just involved translating biology into numbers. I used many different sources to make my estimates, most notably the book <em><a href="https://book.bionumbers.org/">Cell Biology by the Numbers</a></em>, particularly its sections on the <a href="https://book.bionumbers.org/what-is-the-macromolecular-composition-of-the-cell/">macromolecular composition</a> of a cell and the <a href="https://book.bionumbers.org/what-are-the-concentrations-of-free-metabolites-in-cells/">concentration of metabolites</a> in a cell. All of the sources and calculations used to write this essay are <a href="https://cdn.prod.website-files.com/6518bf2f1b5b8c2edad6162d/68c42c3d27a07ac3f4465a0a_microbe_value_data_tables.xlsx">available for download</a>.</p><h2>Metabolites</h2><p>First, I looked at <em>E. coli</em>&#8217;s metabolites and small molecules &#8212; nucleotides, amino acids, simple sugars (maltose, glucose, hexose &#8230; ), their precursors and derivatives, and all the other small organic molecules contributing to a bacterium&#8217;s core metabolic loops. 
I didn&#8217;t include elemental ions or metals, since these are not made by the bacteria themselves.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><p>The difficulty with cataloguing metabolites is that there are a <em>lot</em> of them. In an ideal world, one could randomly sample metabolites, perhaps by selecting a few particularly abundant or pricey ones, and extrapolate from those. In practice, however, we can use the data presented in this <a href="https://www.nature.com/articles/nchembio.186">2009 study</a>, which measures the concentrations of about one hundred common metabolites in <em>E. coli</em>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> Whereas most metabolomics studies only measure <em>relative</em> concentrations of their targets, this particular paper used carefully applied quantitative standards to arrive at absolute concentrations.</p><p>Obtaining prices for individual metabolites was tricky. Most chemical compounds are sold on the market at wildly different purities, each with a different value.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> Many chemicals can be bought in different forms or counterbalanced with different ions.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a> Prices vary wildly between vendors &#8212; sometimes by orders of magnitude &#8212; because of bulk discounts, small market sizes, and volatile supply.</p><p>Ultimately, I relied on Sigma-Aldrich prices,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a> taking the cheapest per-gram option I could find. 
I ignored purity and excluded odd or high-value counterions &#8212; sodium, potassium, free acids, and magnesium stayed in; lead, gold, ammonia, or carbon-based salts were out. Readers could certainly challenge my low-purity assumption; after all, we&#8217;re already relying on near-magical levels of purification for this thought experiment, so why not go all the way and assume we can extract them with near-perfect purity? But still, I wanted to get a reasonably stable, worst-case baseline.</p><p>Given these assumptions, the economic value of metabolites isolated from one liter of <em>E. coli </em>is a paltry $30-40.</p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/EXFIy/2/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e3abf1c-6ef2-4cac-831c-c78411651764_1220x604.png&quot;,&quot;thumbnail_url_full&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25380958-94e8-452d-829d-06b3542761c9_1220x1012.png&quot;,&quot;height&quot;:628,&quot;title&quot;:&quot;Abundant metabolites tend to be less valuable.&quot;,&quot;description&quot;:&quot;Selected prices for 108 metabolites found in E. coli. PRPP (phosphoribosyl pyrophosphate) is the most valuable metabolite analyzed, but is only present in trace amounts. 
Glutamate, on the other hand, is abundant but cheap.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/EXFIy/2/" width="730" height="628" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><p>Surprisingly, the total parts price of metabolites is dominated by a handful of expensive molecules. Just three molecules (6-phosphogluconate, phosphoribosyl pyrophosphate, and succinyl-CoA) account for about half of the total economic value. All of the highest-value molecules are also fairly high in abundance <em>and</em> quite valuable on a per-molecule basis. At the end of the day, though, bacterial metabolites simply aren&#8217;t worth that much.</p><h2>Bulk Macromolecules</h2><p><em>E. coli</em>&#8217;s bulk macromolecules are, at least on paper, far more valuable than its small metabolites. These include cell membrane lipids, the carbohydrates coating the cell&#8217;s exterior, and glycogen molecules, which are used for long-term energy storage; anything that is a more complex arrangement of molecules. These molecules come in a wide range of compositions, and they aren&#8217;t generally sold sourced from <em>E. coli</em>, so I&#8217;ve used prices for comparable macromolecules from the other species noted below. 
I&#8217;ve also excluded one major cell membrane component, phosphatidylserine, because I was unable to find any suitable price listings.</p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/rZ3Sn/2/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5da5b6c2-8666-4857-9b77-66fd9940fcc7_1220x682.png&quot;,&quot;thumbnail_url_full&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a097144d-349a-4e25-9b87-b3c9a75ace78_1220x1108.png&quot;,&quot;height&quot;:427,&quot;title&quot;:&quot;Peptidoglycan, a part of the cell wall, is the most valuable bulk macromolecule.&quot;,&quot;description&quot;:&quot;Sigma-Aldrich prices for a small selection of bulk macromolecules. Most phosphatidylglycerol sold on the market, for example, comes from chicken eggs. Peptidoglycan is already harvested from B. subtilis bacteria.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/rZ3Sn/2/" width="730" height="427" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><h2>Proteins</h2><p>Cell membranes and cell walls are neither the most abundant nor the most complex cellular macromolecules. That honor goes to proteins, the molecular machines that do much of the work of maintaining and growing a cell.</p><p>Proteins are hundreds of times larger than simple metabolites; compared to cell wall and membrane macromolecules, they are also made from a greater diversity of subunits and constructed far more specifically. 
Together, the proteins in an <em>E. coli</em> cell also weigh more than all other types of molecules, including lipids, metabolites, and carbohydrates, combined. Therefore, from the beginning of this thought experiment, I suspected most of the economic value of <em>E. coli</em> would come from proteins.</p><p>Unfortunately, <em>E. coli</em> has around 4,300 distinct protein-coding genes, each encoding a unique protein. As with metabolites, that&#8217;s too long a parts price list to cover comprehensively here. Therefore, I decided to focus on commonly used proteins, abundant proteins, and some random selections.</p><p>In the first scenario, I thought most of the dollar value of the <em>E. coli</em> proteome might come from the proteins that scientists already use in large quantities. After all, other things being equal, high demand drives higher prices. I picked recA,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a> alkaline phosphatase,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a> and the exonuclease V complex<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a> as representative highly used proteins, based on a combination of personal familiarity and my ability to find prices and <em>in vivo</em> concentrations for each. Second, I looked at a few of <em>E. coli</em>&#8217;s most abundant proteins. Specifically, I used the two main ribosomal proteins, as well as the five most abundant proteins by copy number.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a> And finally, I randomly sampled proteins from the <em>E. 
coli</em> genome, selecting ten proteins for which I could find both copy number and price information.</p><p>The outer membrane protein Lpp &#8212; which is the most abundant protein in <em>E. coli </em>and lends the cell membrane its structural rigidity &#8212; is an interesting outlier, worth almost $50,000 per liter on its own.<em> </em>The per-gram cost of the other proteins, however, doesn&#8217;t vary all that much. Most proteins cost between $1 million and $3 million per gram.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a> This lets us extrapolate a total parts price for the entire <em>E. coli</em> proteome without having to price out all 4,300 individual proteins; assuming an average price of $3 million/g, a liter of <em>E. coli</em> contains almost exactly half a million dollars worth of assorted proteins.</p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/1FCcZ/3/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd5a2ea1-6f08-4f43-b3b6-c3ad6c3c0e3b_1220x760.png&quot;,&quot;thumbnail_url_full&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b3efd90-bb89-4936-84c8-bd79c22c99e3_1220x1146.png&quot;,&quot;height&quot;:856,&quot;title&quot;:&quot;Abundance and value of selected E. coli proteins.&quot;,&quot;description&quot;:&quot;LPP, the most common protein in the cell, is also quite valuable on a per-gram basis. Highly valuable proteins, such as Exonuclease V, also tend to be far less abundant inside the cell. 
Note that both axes are log-scale.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/1FCcZ/3/" width="730" height="856" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><h2>Nucleic Acids</h2><p>The last major class of <em>E. coli</em> parts to price out are the nucleic acids: DNA and RNA.</p><p>Large DNA with a relatively arbitrary but fixed sequence isn&#8217;t nearly as valuable as, say, large DNA synthesized to order with a desired sequence, but it does have its uses. Bulk bacteriophage DNA is commonly used as a <a href="https://www.illumina.com/products/by-type/sequencing-kits/cluster-gen-sequencing-reagents/phix-control-v3.html">sequencing control</a> and, a bit less frequently, as a backbone for <a href="https://nvlpubs.nist.gov/nistpubs/jres/126/jres.126.001.pdf">DNA origami structures</a>. Using New England Biolabs&#8217; price for M13 phage DNA as a guide, I estimate that a liter of <em>E. coli </em>genomes is worth about $21,500. This number assumes that an <em>E. coli</em> genome is just as useful per gram as a phage genome, even though it is about a thousand times larger. It&#8217;s unclear whether that should make it more useful<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-17" href="#footnote-17" target="_self">17</a> or less useful.</p><p>RNA is simpler to price out, even though it comes in three major forms with vastly different shapes and functions (messenger RNA transcripts, transfer RNAs, and ribosomal RNAs). 
Thermo Fisher Scientific already <a href="https://www.thermofisher.com/order/catalog/product/AM7940?SID=srch-srp-AM7940">sells total </a><em><a href="https://www.thermofisher.com/order/catalog/product/AM7940?SID=srch-srp-AM7940">E. coli</a></em><a href="https://www.thermofisher.com/order/catalog/product/AM7940?SID=srch-srp-AM7940"> RNA</a> at a current price of $423 for 200 micrograms (about $2.1 million per gram), which implies a parts price of about $127,000 per liter. This is not as much as the entire proteome&#8217;s economic value, but it is more than twice the value of the six most abundant proteins in <em>E. coli</em>. This bulk price is also a lower-bound figure; it&#8217;s entirely possible that the RNA pool could sell for even more as individual, purified RNAs.</p><h2>Is <em>E. coli</em> a Money Printer?</h2><p>At first glance, it seems there&#8217;s a great business opportunity in cannibalizing <em>E. coli</em> for parts. At $627,000/liter, a modest biology lab could easily grow millions of dollars&#8217; worth of <em>E. coli</em> overnight, every night. Surely someone could make a killing that way?</p><p>Unfortunately not. Fractionating an <em>E. coli</em> cell &#8212; that is, blending up a cell and isolating its parts &#8212; is much more expensive than growing it! There are already scientific kits available on the market that can be used for extracting individual proteins or nucleic acids for tens of dollars per purification. But those kits also waste, or discard, all other cellular components. Getting all of the individual proteins (where most of the economic value sits) out of a bacterium at high efficiency and purity would require new technology.</p><p>RNAs are a better proposition, as there are relatively cheap kits for extracting total RNA from a variety of biological samples. But then you&#8217;d be directly competing with Thermo Fisher on a functionally identical product! 
Does the high price on Thermo Fisher&#8217;s total RNA mean that they&#8217;re making insane margins that a competitor could cut in on? Or does it mean that the extraction of bacterial RNA at scale is more expensive than it looks at a glance?</p><p>The second problem with selling <em>E. coli</em> parts is a lack of demand. The proteins expressed in a liter of cells may be worth $500,000 <em>on paper</em>, but unfortunately, there isn&#8217;t much of a market for most of them.</p><p>Big lab suppliers like Thermo Fisher or Sigma-Aldrich don&#8217;t bother selling obscure proteins such as &#8220;sapF&#8221; to scientists because hardly anyone needs them. There&#8217;s no real market. To find prices for a large number of proteins, I had to look at smaller vendors like <a href="https://www.mybiosource.com/">MyBioSource</a>, which will sell you just about any protein in tiny amounts. But in those cases, you&#8217;re really paying for the service of custom purification, not for the protein itself. The price reflects the hassle of isolating it in small batches, not steady demand. If someone actually tried to mass-produce every <em>E. coli</em> protein this way, they&#8217;d run out of customers very quickly.</p><p>Put differently, that $627,000 per liter is only a notional value &#8212; what today&#8217;s customers <em>might</em> pay for one more liter of <em>E. coli</em>, given current catalog prices. The figure is high because, somewhere out there, a few biologists need small amounts of obscure proteins for unusual experiments. Once those needs are met, though, the extra liters wouldn&#8217;t fetch nearly as much. In the end, breaking down <em>E. coli</em> for parts would run into the same problem as <a href="https://doi.org/10.1016/j.actaastro.2019.05.009">asteroid mining</a>: the market isn&#8217;t nearly as big as the sticker prices make it look.</p><p>Based on our parts price stories, it seems the value of <em>E. 
coli</em>-derived molecules has a lot less to do with producing individual molecules and more to do with linking those molecules together in intricate, precise ways. Monomers and high-energy metabolites are barely worth more than the broth that <em>E. coli</em> feed on. Crude, bulk polymers of sugars and lipids are much more valuable. But the real economic value of living machines seems to be in their ability to produce high-complexity, highly-specific protein (and to a lesser extent, RNA) polymers.</p><h2>Transmuting Trash</h2><p>As a synthetic biologist in graduate school, I used to grow a lot of <em>E. coli</em>. Sometimes I grew them as experimental subjects. More often, though, I used them as factories to pump out plasmids destined for use elsewhere.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-18" href="#footnote-18" target="_self">18</a> A typical day in the lab might start by spinning down a couple dozen tubes of 5 mL cultures of overnight <em>E. coli</em> growth, each with a different plasmid, yielding pale yellow, gummy smears or plugs of bacteria that I would break open and carefully filter for the plasmids they carried. At the end of that day, after transforming a new batch of bacteria with new sets of plasmids, the bacteria would be seeded into 5 mL tubes of fresh media, which would go into a shaking incubator to grow overnight for harvesting the next morning.</p><p>The work made it easy to develop a distaste for <em>E. coli</em>. I was always focused on extracting the precious plasmids they carried; the bacteria themselves were just trash, as easy to make as ice in a freezer and just as easy to toss down the drain (after proper sterilization, of course). Everyone in the lab knew that manufacturing a few thousand bases of DNA cost us about $10-20 in media, purification equipment, and labor. 
We never considered that every purification also involved throwing away thousands of dollars&#8217; worth of assorted proteins, RNAs, lipids, glycans, and metabolites.</p><p>If you listen to a metabolic engineer describe their work, you might hear them hail bacteria as quasi-magical self-replicating nanofactories capable of transmuting literal gunk into gold. But if you then visit that same engineer&#8217;s laboratory, you will see those very nanofactories treated as <a href="https://www.asimov.press/p/slime">base sludge</a> &#8212; used and then discarded. It&#8217;s hard for us, gifted as we are with eyes that can&#8217;t actually see a bacterium, to hold <em>E. coli</em>&#8217;s sophistication in our heads for very long. My hope is that, by breaking down the bacterium into its component parts &#8212; at least on paper &#8212; we can more easily come to appreciate it.</p><div><hr></div><p><strong>Samuel Clamons </strong>is a bioinformatics scientist at Illumina, Inc. with a PhD in Bioengineering and training in applied mathematics and computer science. Outside of his day job, he writes science fiction and researches theoretical questions in biology at <em>Asimov Press</em>.</p><p><strong>Cite: </strong>Clamons, Samuel. &#8220;The Price of <em>E. coli</em>.&#8221; <em>Asimov Press </em>(2025). https://doi.org/10.62211/82ue-71kj</p><p><em>Thanks to Eryney Marrogi and Ella Watkins-Dulaney for reading a draft of this. 
Lead image by Ella Watkins-Dulaney.</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This <a href="https://doi.org/10.1111/1751-7915.12385">article</a> lists 13 companies active in 2016 selling products of metabolic engineering. To my surprise, as of 2025, eight still have active websites, with most selling product; two have gone bankrupt or shut down, and the other three still appear in industry summaries and news articles but either don&#8217;t have easily-accessible websites or appear to have had their websites hijacked.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>See <a href="https://doi.org/10.1186/s13068-023-02419-8">this article</a> for more information about the MCF2Chem database of research-scale biosynthetically created compounds.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Though most of this value currently comes from food additives and from microbe-derived foods, such as beer and cheese. 
For scale, though, the current <a href="https://straitsresearch.com/report/bioplastics-market">yearly bioplastics market alone is estimated at $26 billion</a>, <a href="https://straitsresearch.com/report/bio-alcohol-market">bio-alcohols are $11 billion</a>, and biologics (many of which, to be fair, are made in eukaryotes instead of bacteria) <a href="https://straitsresearch.com/report/biologics-market">are $0.5 trillion</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Perhaps using some hyper-effective form of liquid chromatography?</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>In practice, LB cultures easily reach 2 or 3 OD, and can be pushed quite a bit higher, though at high densities, OD isn&#8217;t linear with respect to actual bacteria density (so a 3 OD culture has quite a bit less than 3x as many bacteria as a 1 OD one). Consider 1 OD a conservative estimate.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>An <em>E. coli </em>cell is 70 percent water by mass.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>By this logic, I really shouldn&#8217;t include glucose or amino acids, which are provided directly to our precocious little cell factories in the LB broth. 
It turns out they don&#8217;t contribute much to the overall parts price for metabolites anyway, so I&#8217;ve included them merely for completeness.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>By my calculations, the metabolites listed in Bennett <em>et al.</em> add up to about 17 percent of total <em>E. coli</em> mass, which is about six times higher than the total metabolite pool estimate from <em>Cell Biology by the Numbers</em>. Clearly, there&#8217;s some discrepancy between different sources, so adjust your confidence accordingly; I nevertheless feel comfortable claiming that this list of 100-odd molecules represents &#8220;most&#8221; of the <em>E. coli</em> metabolome by mass and molecule count.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>As an example, Sigma-Aldrich sells citrate (as a salt with either sodium or potassium) in purities of 97 percent, 98 percent, 99 percent, 99.5 percent, 99-105 percent, &#8220;meets USP testing specifications,&#8221; &#8220;suitable for cell culture,&#8221; &#8220;Pharmaceutical Secondary Standard; Certified Reference Material,&#8221; &#8220;Molecular Biology Grade,&#8221; &#8220;EMPLURA&#174;,&#8221; and more.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>Elaborating on the previous example, citrate comes as either a solution (with or without pH balancing) or as a crystalline salt counterbalanced with sodium, zinc, potassium, magnesium, ammonium-iron, or lead, sometimes at different 
hydrations.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>Sigma-Aldrich is a well-respected, broadly-stocked one-stop-shop for most purchasable chemicals, and many other companies sell through their website portal.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>A critical enzyme for fixing DNA strand breaks, used in a variety of assays and reactions requiring DNA binding or recombination.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p>An enzyme that removes phosphate groups from a variety of molecules, used in molecular cloning to functionally &#8220;block&#8221; the ends of DNA from sticking end-to-end.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-14" href="#footnote-anchor-14" class="footnote-number" contenteditable="false" target="_self">14</a><div class="footnote-content"><p>A complex of three enzymes (recB, recC, and recD) that chews back single-stranded overhangs in DNA complexes of mismatched length. 
Also known as recBCD.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-15" href="#footnote-anchor-15" class="footnote-number" contenteditable="false" target="_self">15</a><div class="footnote-content"><p>According to <a href="https://doi.org/10.1016/j.dib.2014.08.004">Wi&#347;niewski &amp; Rakus (2014)</a>, whose supplementary data I used as a primary source for protein concentrations and molecular weights throughout.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-16" href="#footnote-anchor-16" class="footnote-number" contenteditable="false" target="_self">16</a><div class="footnote-content"><p>The exceptions tell interesting stories of their own. Alkaline phosphatase, the cheapest protein by far, competes on the market with shrimp-derived alkaline phosphatase, functionally similar but extremely cheap to produce for some reason. Ribosomes are presumably cheaper to purify because they are both abundant and heavy. RecA is one of a small number of <em>E. coli</em> proteins produced at scale by New England Biolabs, making it cheaper than average. On the other end of the price spectrum, the exonuclease V complex, also sold by New England Biolabs, is likely expensive (per gram) because it <em>is</em> a complex of multiple proteins, tuned for maximum efficiency and conveniently packaged with buffers and optimized protocols &#8212; in other words, when you buy it, you&#8217;re really paying for more than just the proteins.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-17" href="#footnote-anchor-17" class="footnote-number" contenteditable="false" target="_self">17</a><div class="footnote-content"><p>An <em>E. 
coli</em> scaffold could be used to make DNA origami structures much larger than those made from, say, bacteriophage genomes.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-18" href="#footnote-anchor-18" class="footnote-number" contenteditable="false" target="_self">18</a><div class="footnote-content"><p>A plasmid is a small circle of DNA that replicates in a bacterial host, separate from the main chromosome. Plasmids typically hold between one and a few genes, and are frequently used by biologists as an easy-to-manipulate chassis for inserting new functionality. If the genome is a hard drive, a plasmid is a tiny flash drive.</p></div></div>]]></content:encoded></item><item><title><![CDATA[How Much Information is in DNA?]]></title><description><![CDATA[Answering this question may seem straightforward, but actually requires an odyssey through information theory and molecular biology.]]></description><link>https://www.asimov.press/p/dna-information</link><guid isPermaLink="false">https://www.asimov.press/p/dna-information</guid><dc:creator><![CDATA[dynomight]]></dc:creator><pubDate>Thu, 08 May 2025 13:11:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!l0q2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l0q2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!l0q2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 424w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 848w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l0q2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png" width="1456" height="917" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4572739,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/163086816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l0q2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 424w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 848w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!l0q2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13995d23-5290-4872-969f-255cf9a0226d_2000x1260.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Do you like information theory? Do you like molecular biology? Do you like the idea of smashing them together and seeing what happens? If so, then here's a question: How much information is in your DNA?</p><p>When I first looked into <a href="https://malmesbury.substack.com/p/mechanisms-too-simple-for-humans?utm_source=share&amp;utm_medium=android&amp;r=4t71d6&amp;triedRedirect=true">this question</a>, I thought it was simple:</p><ol><li><p>Human DNA has about 3.1 billion base pairs.</p></li><li><p>Each base pair can take one of four values (A, T, C, or G)</p></li><li><p>It takes 2 bits to encode one of four possible values (00, 01, 10, or 11)</p></li><li><p>Thus, human DNA contains 6.2 billion bits.</p></li></ol><p>Easy, right? Sure, except:</p><ol><li><p>You have <em>two</em> versions of each base pair, one from each of your parents. Should you count both?</p></li><li><p>All humans have almost identical DNA. Does that matter?</p></li><li><p>DNA can be compressed. Should you look at the compressed representation?</p></li><li><p>It's not clear how much of our DNA actually does something useful. The insides of your cells are a convulsing pandemonium of interacting "hacks", designed to keep working even as mutations constantly screw around with the DNA itself. Should we only count the "useful" parts?</p></li></ol><p>Such questions quickly run into the limits of knowledge for both biology and computer science. To answer them, we need to figure out what exactly we mean by "information" and how that's related to what&#8217;s happening inside cells. 
In attempting that, I will lead you through a frantic tour of information theory and molecular biology. We'll meet some strange characters, including genomic compression algorithms based on deep learning, retrotransposons, and Kolmogorov complexity. </p><p>Ultimately, I'll argue that the intuitive idea of information in a genome is best captured by a new definition of a "bit" &#8212; one that's unknowable with our current level of scientific knowledge.</p><h2><strong>On counting</strong></h2><p>What is "information"? This isn't <em>just</em> a pedantic question, as there are actually several different mathematical definitions of a "bit". Often, the differences don't matter, but for DNA, they turn out to matter a lot, so let's start with the simplest.</p><p>In the <strong>storage space definition</strong>, a bit is a "slot" in which you can store one of two possible values. If some object can represent <em>2&#8319;</em> possible patterns, then it contains <em>n</em> bits, regardless of which pattern actually happens to be stored.</p><p>So here's a question we can answer precisely: How much information <em>could</em> your DNA store?</p><p>A few reminders: DNA is a polymer. It's a long chain of chunks of ~40 atoms called "nucleotides". 
There are four different chunks, commonly labeled A, T, C, and G. In humans, DNA comes in 23 pieces of different lengths, called "chromosomes." Humans are "diploid," meaning we have two versions of each chromosome. We get one from each of our parents, made by randomly weaving together sections from the two chromosomes <em>they</em> got from <em>their</em> parents.</p><p>At least, that's true for the first 22 chromosomes. For the last, females have two "X" chromosomes, while males have one "X" and one "Y" chromosome. There's no mixing between these, so men pass on one to their children pretty much unchanged.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zoxM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zoxM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zoxM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zoxM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!zoxM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zoxM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg" width="1456" height="1165" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:702719,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/163086816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zoxM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zoxM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!zoxM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zoxM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b3acd1e-1194-4d41-9ade-d7e3be86dfdb_5000x4000.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A karyotype showing the 23 pairs of human chromosomes.</figcaption></figure></div><p>Chromosomes 1-22 have a total 
of 2.875 billion nucleotides; the X chromosome has 156 million, and the Y chromosome has 62 million. From here, we can calculate the total storage space in your DNA. Remember, each nucleotide has 4 options, corresponding to 2 bits. So if you're female, your total storage space is:</p><blockquote><p>(2&#215;2875 + 2&#215;156) million nucleotides</p><p>&#215; 2 bits / nucleotide</p><p>= 12.12 billion bits</p><p>= 1.51 GB.</p></blockquote><p>If you're male, the total storage space is:</p><blockquote><p>(2&#215;2875 + 156 + 62) million nucleotides</p><p>&#215; 2 bits / nucleotide</p><p>= 11.94 billion bits</p><p>= 1.49 GB.</p></blockquote><p>For comparison, a standard single-layer DVD can store 37.6 billion bits or 4.7 GB. The code for your body, magnificent as it is, takes up as much space as around 40 minutes of standard definition video.</p><p>So in principle, your DNA could represent around 2<sup>12,000,000,000</sup> different patterns. But hold on. Given human common ancestry, the chromosome pair you got from your mother is almost identical to the one you got from your father. And even ignoring that, there are long sequences of nucleotides that are repeated over and over in your DNA, enough to make up a significant fraction of the total. It seems weird to count all this repeated stuff. So perhaps we want a more nuanced definition of "information."</p><h2><strong>On compression</strong></h2><p>A string of 12 billion zeros is much longer than this article. But most people would (I hope) agree that this article contains more information than a string of 12 billion zeros. Why?</p><p>One of the fundamental ideas from information theory is to define information in terms of compression. Roughly speaking, the "information" in some string is the length of the shortest possible compressed representation of that string.</p><p>So how much can you compress DNA? Answers to this question are all over the place. 
Some people claim it can be compressed by more than 99 percent, while others claim the state of the art is only around 25 percent. This discrepancy is explained by different definitions of "compression", which turn out to correspond to different notions of "information".</p><p>If you pick any two random people on Earth, almost all of their DNA will be exactly the same. It's often said that people are 99.9 percent genetically identical, but this is wrong &#8212; it only measures substitutions and neglects things like insertions, deletions, and transpositions. If you account for all these things, the best estimate is that we are ~<a href="https://www.genome.gov/about-genomics/educational-resources/fact-sheets/human-genomic-variation">99.6 percent identical</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>The fact that we share so much DNA is key to how some algorithms can compress DNA by more than 99 percent. They do this by first storing a <em>reference</em> genome, which includes all the DNA that's shared by all people and perhaps the most common variants for regions of DNA where people differ. Then, for each individual person, these algorithms only store the <em>differences</em> from the reference genome. Because that reference only has to be stored once, it isn't counted in the compressed representation.</p><p>That&#8217;s great if you want to cram as many of your friends&#8217; genomes on a hard drive as possible. But it&#8217;s a strange definition to use if you want to measure the "information content of DNA". It implies that any genomic content that doesn't change between individuals isn't important enough to count as 'information'. However, we know from evolutionary biology that it's often the most crucial DNA that changes the least <em>precisely because</em> it's so important. 
Heritability tends to be <a href="https://dynomight.net/heritability/#example-heritability-in-different-species">lower</a> for genes more closely related to reproduction.</p><p>The best compression <em>without</em> a reference seems to be around <a href="https://doi.org/10.1093/gigascience/giaa119">25 percent</a>. (I expect this number to rise a bit over time, as the newest methods use deep learning and research is ongoing.) That's not a lot of compression. However, these algorithms are benchmarked in terms of how well they compress a genome that includes only <em>one</em> copy of each chromosome. Since your two chromosomes are almost identical (at least, ignoring the Y chromosome), I&#8217;d guess that you could represent the other half almost for free, meaning a compression rate of around 50 percent + &#189; &#215; 25 percent &#8776; 62 percent.</p><h2><strong>On information</strong></h2><p>So if you compress DNA using an algorithm with a reference genome, it can be compressed by more than 99 percent, down to less than 120 million bits. But if you compress it without a reference genome, the best you can do is 62 percent, meaning 4.6 billion bits.</p><p>Which of these is right? The answer is that <em>either</em> could be right. There are two different definitions of a &#8220;bit&#8221; in information theory that correspond to different types of compression.</p><p>In the <strong>Kolmogorov complexity definition</strong>, named after the remarkable Soviet mathematician Andrey Kolmogorov, a bit is a property of a particular string of <code>1</code>s and <code>0</code>s. 
The number of bits of information in the string is the length of the shortest computer program that would output that string.</p><p>In the <strong>Shannon information definition</strong>, named after the also-remarkable American polymath Claude Shannon, a bit is again a property of a particular sequence of <code>1</code>s and <code>0</code>s, but it's only defined relative to some large pool of possible sequences. In this definition, if a given sequence has a probability <em>p</em> of occurring, then it contains <em>n</em> bits for whatever value of <em>n</em> satisfies <em>2&#8319;=1/p</em>. Or, equivalently, <em>n=-log&#8322; p</em>.</p><p>The Kolmogorov complexity definition is clearly related to compression. But what about Shannon&#8217;s?</p><p>Well, say you have three beloved pet rabbits, Fluffles, Marmalade, and Sparklepuff. And say you have one picture of each of them, each 1 MB in size when compressed. To keep me updated on how you're feeling, you like to send me these same pictures over and over again, with different pets for different moods. You send a picture of Fluffles &#189; of the time, Marmalade &#188; of the time, and Sparklepuff &#188; of the time. (You only communicate in rabbit pictures, never with text or other images.)</p><p>But then you decide to take off in a spacecraft, and your data costs go <em>way</em> up. Continuing the flow of pictures is crucial, so what's the cheapest way to do that? The best thing would be for us to agree that if you send me a <code>0</code>, I should pull up the picture of Fluffles, while if you send <code>10</code>, I should pull up Marmalade, and if you send <code>11</code>, I should pull up Sparklepuff. This is unambiguous: If you send <code>0011100</code>, that means Fluffles, then Fluffles again, then Sparklepuff, then Marmalade, then Fluffles one more time.</p><p>It all works out. 
The "code length" for Fluffles is the number <em>n</em> so that <em>2&#8319;=1/p</em>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vSI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vSI5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 424w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 848w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 1272w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vSI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png" width="1240" height="376" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:376,&quot;width&quot;:1240,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35091,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/163086816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vSI5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 424w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 848w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 1272w, https://substackcdn.com/image/fetch/$s_!vSI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccba50c9-f655-47c3-ab70-fae82c893daa_1240x376.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Intuitively, the idea is that if you want to send as few bits as possible over time, then you should give short codes to high-probability patterns and long codes to low-probability patterns. If you do this <em>optimally</em> (in the sense that you'll send the fewest bits over time), it turns out that the best thing is to code a pattern with probability <em>p</em> with about <em>n</em> bits, where <em>2&#8319;=1/p</em>. (In general, things don't work out quite this nicely, but you get the idea.)</p><p>In the Fluffles scenario, the Kolmogorov complexity definition would say that each of the images contains 1 MB of information, since that's the smallest size each image can be compressed to. But under the Shannon information definition, the Fluffles image contains 1 bit of information, and the Marmalade and Sparklepuff images contain 2 bits. 
This is quite a difference!</p><p>Now, let's return to DNA. There, the Kolmogorov complexity definition basically corresponds to the best possible compression algorithm without a reference. As we saw above, the best current approach can compress by around 62 percent. So, under the Kolmogorov complexity definition, DNA contains at most 12 billion &#215; (1-0.62) &#8776; 4.6 billion bits of information.</p><p>Meanwhile, under the Shannon information definition, you can assume that the distribution of all human genomes is known. The information in <em>your</em> DNA only includes the bits needed to reconstruct <em>your</em> genome. That's essentially the same as compressing with a reference. So, under the Shannon information definition, your DNA contains less than 12 billion &#215; (1-0.99) &#8776; 120 million bits of information.</p><p>While neither of these is "wrong" for DNA, I prefer the Kolmogorov complexity definition for its ability to best capture DNA that codes for features and functions shared by all humans. After all, if you're trying to measure how much "information" our DNA carries from our evolutionary history, surely you want to include that which has been universally preserved.</p><h2><strong>On biology</strong></h2><p>At some point, your high-school biology teacher probably told you (or will tell you) this story about how life works:</p><ol><li><p>First, your DNA gets transcribed into matching RNA.</p></li><li><p>Next, that RNA gets translated into protein.</p></li><li><p>Then the protein does Protein Stuff.</p></li></ol><p>If things were that simple, we could easily calculate the information density of DNA just by looking at what fraction of your DNA ever becomes a protein (only around 1 percent). But it's not that simple. The rest of your DNA does other important things, like regulating what proteins get made. Some of it seems to exist only for the purpose of copying itself. 
Some of it might do nothing, or it might do important things we don't even know about yet.</p><p>So let me tell you that story again with slightly more detail:</p><ol><li><p>In the beginning, your DNA is relaxing in the nucleus.</p></li><li><p>Some parts of your DNA, called <strong>promoters</strong>, are designed so that if certain proteins are nearby, they'll stick to the DNA.</p></li><li><p>If that happens, then a hefty little enzyme called "RNA polymerase" will show up, crack open the two strands of DNA, and start transcribing the nucleotides on one side into "pre-messenger RNA" (pre-mRNA).</p></li><li><p>Eventually, for one of several reasons, the enzyme will decide it's time to stop transcribing, and the pre-mRNA will detach and float off into the nucleus. At this point, it's a few thousand or a few tens of thousands of nucleotides long.</p></li><li><p>Then, my personal favorite macromolecular complex, the "spliceosome", grabs the pre-mRNA, cuts away most of it, and throws those parts away. The sections of DNA that code for the parts that are kept are called <strong>exons</strong>, while the sections that code for parts that are thrown away are called <strong>introns</strong>.</p></li><li><p>Next, another enzyme called "RNA guanylyltransferase" (we can't all be beautiful) adds a "cap" to one end, and an enzyme called "poly(A) polymerase" adds a "tail" to the other end.</p></li><li><p>The pre-mRNA is now all grown up and has graduated to being regular mRNA. At this point, it is a few hundred or a few thousand nucleotides long.</p></li><li><p>Then, some proteins notice that the mRNA has a tail, grab it, and throw it out of the nucleus into the cytoplasm, where the noble ribosome lurks.</p></li><li><p>The ribosome grabs the mRNA and turns it into a protein. It does this by starting at one end and looking at chunks of three nucleotides at a time, called "codons". 
When it sees a certain "start" pattern, it starts translating each chunk into one of 20 amino acids and continues until it sees a chunk with a "stop" pattern.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p></li><li><p>The resulting protein lives happily ever after.</p></li></ol><p>It's thought that ~<a href="https://en.wikipedia.org/wiki/Exon">1 percent</a> of your DNA is exons and ~<a href="https://www.ncbi.nlm.nih.gov/books/NBK595930/">24 percent</a> is introns. What's the rest of it doing?</p><p>Well, while the above dance is happening, other sections of DNA are "regulating" it. <strong>Enhancers</strong> are regions of DNA where a certain protein can bind and cause the DNA to physically bend so that some promoter somewhere else (typically within a million nucleotides) is more likely to get activated. <strong>Silencers</strong> do the opposite. <strong>Insulators</strong> block enhancers and silencers from influencing regions they shouldn't influence.</p><p>While that might sound complicated, we're just warming up. The same region of DNA can be both an intron and an enhancer <em>and/or</em> a silencer. That's right, in the middle of the DNA that codes for some protein, evolution likes to put DNA that regulates some other, distant protein. When it's not regulating, it gets transcribed into (probably useless) pre-mRNA and then cut away and recycled by the spliceosome.</p><p>There's also structural DNA that's needed to physically manipulate the chromosomes. <strong>Centromeres</strong> are "attachment points" used when copying DNA during cell division. 
<strong>Telomeres</strong> are "extra" DNA at the ends of the chromosomes.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p><p>Further complicating this picture are many regions of DNA that code for RNA that's never translated into a protein but still has some function. Some regions make tRNA, whose job is to bring amino acids to the ribosome. Other regions make rRNA, which bundles together with some proteins to <em>become</em> the ribosome. There's siRNA, microRNA, and piRNA that screw around with the mRNA that's produced. And there's scaRNA, snoRNA, rRNA, lncRNA, and mrRNA. Many more types are sure to be defined in the future, because it's hard to know for sure if DNA gets transcribed, because it's hard to know what functions RNA might have, and because academics have strong incentives to invent ever-finer subcategories.</p><p>There are also <strong>pseudogenes</strong>. These are regions of DNA that <em>almost</em> make proteins, but not quite. Sometimes, this happens because they lack a promoter, so they never get transcribed into mRNA. Other times, they might lack a start codon, so after their mRNA makes it to the ribosome, it never actually starts making a protein. Then, there are instances when the DNA has an early stop codon or a "frameshift" mutation, meaning the alignment of the RNA into chunks of three gets screwed up. In these cases, the ribosome will often detect that something is wrong and <a href="https://en.wikipedia.org/wiki/Nonsense-mediated_decay">call for help</a> to destroy the faulty mRNA. In other cases, a short protein is made that doesn't do anything.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a></p><h2><strong>On messiness</strong></h2><p>Why? Why is this all such a mess? Why is it so hard to say if a given section of DNA does anything useful?</p><p>Biologists hate "why" questions. 
We can't re-run evolution, so how can we say "why" evolution did things the way it did? Better to focus on <em>how</em> biological systems actually work. This is probably wise. But since I'm not a biologist (or wise), I'll give my theory: Cells work like this because DNA is under constant attack from mutations.</p><p>Mutations most commonly arise during cell replication. Your DNA is composed of around 250 billion atoms. Making a perfect copy of all those atoms is hard. Your body has amazing nanomachines with many redundant mechanisms to try to correct errors, and it's estimated that the error rate is less than <a href="https://doi.org/10.1038/cr.2008.4">one per billion nucleotides</a>. But with several billion nucleotides, mutations happen.</p><p>There are also <em>environmental</em> sources of mutations. Ultraviolet light has more energy than visible light. If it hits your skin, that energy can sort of knock atoms out of place. The same thing happens if you're exposed to radiation. Certain chemicals, like formaldehyde, benzene, or asbestos, can also do this or can interfere with your body's error correction tricks.</p><p>Finally, we return to the huge fraction of your DNA (~50-60 percent) that consists of repeats of the same sequences. Some of this is caused by the machinery "slipping" while making a copy, leading to a loss or repetition of some DNA. There are also little sections of DNA called "transposons" that sort of trick your machinery into making another copy of those sections and then inserting them somewhere else in the genome.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><p>Mutations in your regular cells will just affect <em>you,</em> but mutations in your sperm/eggs could affect all future generations. Evolution helps manage this through selection. 
Say you have 10 bad mutations, and I have 10 bad mutations, but those mutations are in different spots. If we have some babies together, some of them might get 13 bad mutations, but some might only get 7, and the latter babies are more likely to pass on their genes.</p><p>But as well as selection, cells seem designed to be extremely <em>robust</em> to these kinds of errors. Instead of <em>just</em> relying on selection, there are many redundant mechanisms to tolerate them without much issue.</p><p>And remember, evolution is a madman. If it decides to <em>tolerate</em> some mutation, everything else will be optimized against it. So even if a mutation is harmful <em>at first</em>, evolution may later find a way to make use of it.</p><h2><strong>On information again</strong></h2><p>So, in theory, how <em>should</em> we define the "information content" of DNA? I propose a definition I call the "phenotypic Kolmogorov complexity".<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> Roughly speaking, this is how short you <em>could</em> make DNA and still get a "human".</p><p>The "phenotype" of an animal is just a fancy way of referring to its "observable physical characteristics and behaviors". So this definition says, like Kolmogorov complexity, to try and find the shortest compressed representation of the DNA. 
But instead of needing to lead to the same DNA you have, it just needs to lead to an embryo that would look and behave like you do.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0nYR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0nYR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 424w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 848w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 1272w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0nYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png" width="1456" height="920" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:920,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:405806,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/163086816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0nYR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 424w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 848w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 1272w, https://substackcdn.com/image/fetch/$s_!0nYR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffad3da7e-46b0-429d-a818-a12d8c6cb424_5000x3158.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The idea is this: Take a single-cell human embryo with your DNA, and imagine all the different ways you can modify the DNA. This would include not only removing useless sections but also moving things around. Limit yourself to changes that still lead to a "person" that would still look like you and have all the same capabilities you do. Now, compress each of those representations. The smallest compressed representation is the "information" in your DNA.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a></p><p>So what would this number be? 
My <em>guess</em> is that you could reduce the amount of DNA by at least 75 percent, but not by more than 98 percent, meaning the information content is:</p><blockquote><p>6 billion nucleotides</p><p>&#215; 2 bits / nucleotide</p><p>&#215; (2 to 25 percent)</p><p>= 240 million to 3 billion bits</p><p>= 30 MB to 375 MB</p></blockquote><p>But in reality, nobody knows. We still have no idea what (if anything) lots of DNA is doing, and we're a long way from fully understanding how much it can be reduced. Probably, no one will know for a long time.</p><div><hr></div><p><strong>Dynomight</strong> writes about science and dispenses life advice at <a href="https://dynomight.net/">dynomight.net</a>.</p><p><strong>Cite: </strong>Dynomight. &#8220;How Much Information is in DNA?&#8221; <em>Asimov Press </em>(2025). https://doi.org/10.62211/42ew-88gf</p><p>Lead image by Ella Watkins-Dulaney. Thanks to Sam Clamons for reading a draft of this essay.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Technically there's also a tiny amount of DNA in the mitochondria. This is neat because you get it from your mother basically unchanged and so scientists can trace tiny mutations back to see how our great-great-&#8230;-great grandmothers were all related. If you go far enough back, our maternal lines all lead to a single woman, Mitochondrial Eve, who probably lived in East Africa 120,000 to 156,000 years ago. But mitochondrial DNA is tiny so I won't mention it again.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Fun facts: Because of these deletions and insertions, different people have slightly different amounts of DNA. 
In fact, each of your chromosome pairs has DNA of slightly different lengths. When your body creates sperm/ova it uses a <a href="https://en.wikipedia.org/wiki/Synaptonemal_complex">crazy machine</a> to align the chromosomes in a sensible way so different sections can be woven together without creating nonsense. Also, those same measures of similarity would say that we're around 96 percent identical with our closest living cousins, the bonobos and chimpanzees.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Since there are 4 kinds of nucleotides, there are 4&#179;=64 possible chunks, while your body only uses 20 amino acids. So the ribosome, logically, gives some amino acids (like leucine) six different codons, and others (like tryptophan) only one codon. Also there are three different stop codons, but only one start codon, and that start codon is also the codon for methionine. So all proteins have methionine at one end unless something else comes and removes it later. Biology is layer after layer of this kind of exasperating complexity, totally indifferent to your desire to understand it.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Telomeres shrink as we age. The body has mechanisms to re-lengthen them, but it mostly only uses these in stem cells and reproductive cells. 
Longevity folks are interested in activating these mechanisms in other tissues to fight aging, but this is risky since the body seems to <em>intentionally</em> limit telomere repair as a strategy to prevent cancer cells from growing out of control.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>In more serious cases, these mutations might make the organism non-viable, or lead to problems like Tay-Sachs disease or cystic fibrosis. But such a gene wouldn't be considered a pseudogene.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>"DNA transposons" get cut out and stuck back in somewhere else, while "retrotransposons" create RNA that's designed to get reverse-transcribed back into the DNA in another location. There are also "retroviruses" like HIV that contain RNA that they insert into the genome. Some people theorize that retrotransposons can evolve into retroviruses and vice-versa.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>It's rare for retrotransposons to actually succeed in making a copy of themselves. They seem to have only a 1 in 100,000 or 1 in 1,000,000 chance of copying themselves during cell division. 
But this is perhaps 10 times as high in the germ line, so the sperm from older men is more likely to contain such mutations.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>This has surely been proposed by someone before, but I can't find a reference, try as I might.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>This definition isn't totally precise, because I'm not saying how precisely the phenotype needs to match. Even if there's some completely useless section of DNA and we remove it, that would make all your cells a tiny bit lighter. We need to tolerate <em>some</em> level of approximation. The idea is that it should be <em>very</em> close, but it's hard to make this precise.</p></div></div>]]></content:encoded></item><item><title><![CDATA[China’s Clinical Trial Boom]]></title><description><![CDATA[In 2017, there were just over 600 clinical trials initiated in China. By 2023, that number was nearly 2,000. 
How can American companies kick off a similar boom?]]></description><link>https://www.asimov.press/p/china-trials</link><guid isPermaLink="false">https://www.asimov.press/p/china-trials</guid><dc:creator><![CDATA[Hiya Jain]]></dc:creator><pubDate>Thu, 24 Apr 2025 13:20:06 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/28fefe6a-9028-4f7c-95e4-035b70e78f14_2000x1260.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In January, the Chinese company DeepSeek released a reasoning AI model, called R1, that <a href="https://news.ycombinator.com/item?id=41999151">performs comparably</a> to OpenAI's o1 model on certain AI benchmarks but was developed with fewer resources at a much lower cost. Marc Andreessen called it &#8220;one of the most amazing and impressive breakthroughs I&#8217;ve ever seen,&#8221; even as NVIDIA&#8217;s stock plummeted by 17 percent &#8212; the largest ever one-day loss for a U.S. company.</p><p>While China&#8217;s AI competitiveness may have blindsided the tech world, the <a href="https://www.csis.org/analysis/united-states-cannot-afford-disarray-china-strengthens-its-biopharmaceutical-industry">pharmaceutical industry</a> has already had quite a few &#8220;<a href="https://www.wsj.com/health/pharma/the-drug-industry-is-having-its-own-deepseek-moment-68589d70">DeepSeek Moments</a>&#8221; of its own.&nbsp;</p><p>About <a href="https://www.biospace.com/business/big-pharma-rushes-to-china-for-deal-prospecting-despite-regulatory-uncertainty">one-fourth of all clinical trials</a> and early drug development now take place in China. 
Large pharmaceutical companies in-license about a third of their experimental molecules from Chinese laboratories (meaning they purchase the rights to molecules developed by other research groups rather than discover them in-house), according to <a href="https://www.stifel.com/newsletters/investmentbanking/bal/marketing/healthcare/biopharma_timopler/2025/biopharmamarketupdate_outlook_2025.pdf">a report</a> by Stifel. Just a couple of years ago, this number was about 10 percent.</p><p>"Ten years ago, a major [pharmaceutical company] seeking their next breakthrough molecule would have turned to an American or European biotech,&#8221; writes biotechnologist <a href="https://atelfo.github.io/2024/12/20/will-all-our-drugs-come-from-china.html">Alex Telford</a>. &#8220;Today, they&#8217;re just as likely to license a molecule from a Chinese company. Chinese companies will often run the phase I trial in China for cheap, then flip it to a Western [pharmaceutical company] to run the expensive US trials and bring the drug to market."</p><p>This shift is due, in part, to policy. Chinese regulators have passed reforms that lower barriers to market entry and streamline approvals. Those reforms could hold lessons for U.S. regulators hoping to speed up drug development, too.</p><p>The impact of China&#8217;s reforms is visible in clinical trial enrollment data.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> Such data is illuminating because enrollment numbers offer a measure of the administrative burden placed on new treatments as they move from concept to real-world testing. Patent filings and R&amp;D budgets suggest how much a region invests in discovery, but clinical trial activity shows how projects navigate bureaucratic landscapes. By analyzing how enrollment rates respond to changes in policy, funding, and disease targets, we can see which approaches most effectively accelerate pharmaceutical innovation. 
Here&#8217;s what the data shows.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SFIQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SFIQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 424w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 848w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SFIQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png" width="1240" height="1232" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1232,&quot;width&quot;:1240,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:230467,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/161109218?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SFIQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 424w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 848w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!SFIQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8b6b2c5-4673-430e-9412-a5f4c94eb7f3_1240x1232.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In the early 2010s, the number of clinical trials performed by American companies increased steadily but then leveled out at about 1,900 studies each year. China&#8217;s clinical trial numbers, on the other hand, remained relatively low until the mid-2010s. After the government <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC8301550/">streamlined approval policies</a>, though, the number of clinical trials soared. 
Chinese companies matched the American clinical trial volume in the span of a few years while sustaining their average enrollment trajectory (at a marginally higher rate).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!duqE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!duqE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 424w, https://substackcdn.com/image/fetch/$s_!duqE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 848w, https://substackcdn.com/image/fetch/$s_!duqE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 1272w, https://substackcdn.com/image/fetch/$s_!duqE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!duqE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png" width="1456" height="1184" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1184,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:122766,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/161109218?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!duqE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 424w, https://substackcdn.com/image/fetch/$s_!duqE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 848w, https://substackcdn.com/image/fetch/$s_!duqE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 1272w, https://substackcdn.com/image/fetch/$s_!duqE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00702b90-e855-46f9-a3cf-b50eeed75ceb_1480x1204.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>China&#8217;s jump in clinical trial enrollments doesn&#8217;t stem from a small number of large, late-phase trials (which would skew the mean), either. 
Rather, the <a href="https://www.nature.com/articles/d41573-024-00120-5#:~:text=In%20the%202024%20pipeline%2C%20465,5).">number of <em>original</em>, <em>new</em> drugs</a> originating in China has <a href="https://www.citeline.com/-/media/citeline/resources/pdf/white-paper_annual-pharma-rd-review-2024.pdf">climbed</a> from almost zero in 2010 to a figure, in 2023, that approaches American totals.&nbsp;</p><p>How did they do it?&nbsp;</p><p>Chinese regulators introduced several measures to speed up clinical trial approvals, including priority review and conditional approvals for new drugs. Drugs that qualify for priority review often address a critical, unmet clinical need, allowing them to go through an accelerated evaluation timeline.</p><p>In 2017, the NMPA (China&#8217;s drug regulator) also launched an &#8220;implied license&#8221; policy, which automatically authorizes a clinical trial if regulators voice no objections within 60 days. That same year, China joined the International Council for Harmonisation (ICH) and updated its rules to accept overseas clinical trial data, so companies no longer need to repeat full trials in China if high-quality foreign results exist. Amgen&#8217;s <a href="https://www.amgen.cn/en/media/amgen-denosumab.html">XGEVA</a>, a medication used to treat bone cancer, was approved in 2019 without further testing requirements, based on a global Phase 2 study that included no clinical trial sites in China. 
Easier approvals and commercially viable results in multiple markets also led to <a href="https://www.ft.com/content/f76c2e6b-dcc4-4e2c-a007-b53330226a5f">international investment</a>, which furthered the country&#8217;s biotech boom.</p><p>These reforms may help to explain why the number of Chinese clinical trials tripled between 2017 and 2023, from around 600 per year to nearly 2,000. Nor has the growing number of Chinese trials come at the expense of enrollment.</p><p>A <a href="https://www.iqvia.com/insights/the-iqvia-institute/reports-and-publications/reports/global-trends-in-r-and-d-2023">2023 IQVIA report</a> notes that industry-wide trial complexity declined after pandemic-era peaks, favoring numerous smaller studies. This trend is evident in the U.S. data, where over three-quarters of recent trials now enroll fewer than 100 participants. The result is that many American trials are smaller and underpowered, sometimes due to difficulties in recruiting participants. The <a href="https://www.thelancet.com/journals/lanonc/article/PIIS1470-2045(23)00344-3/abstract">LUNAR trial</a> for lung cancer was forced to curtail its enrollment by half due to &#8220;slow accrual,&#8221; for example. 
More than 40 percent of clinical trials in China, by contrast, have high enrollment levels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cVQf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cVQf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 424w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 848w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 1272w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cVQf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png" width="1456" height="1791" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1791,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:523630,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/161109218?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cVQf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 424w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 848w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 1272w, https://substackcdn.com/image/fetch/$s_!cVQf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa11f12d-2407-453b-a944-7f1c398d2e27_1598x1966.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Ultimately, the clinical trial data tells a story of two paths: an American system that pioneered modern pharmaceutical development but now exhibits signs of plateau, and a rapidly ascending Chinese system that reformed its processes to maximize efficiency.</p><p>&#8220;Progress in biopharma is ultimately driven by a fast feedback loop of human data collection,&#8221; Telford writes. 
&#8220;China&#8217;s regulatory reforms have made it faster and cheaper to get drugs into humans, and the learning rate of the Chinese biopharma ecosystem seems a lot higher at the moment than the US or European ones.&#8221;</p><p>Recent proposals from <a href="https://ifp.org/the-case-for-clinical-trial-abundance/">The Clinical Trial Abundance Initiative</a> reinforce this broader lesson, demonstrating that &#8220;democratizing&#8221; clinical research trials &#8212; through measures like expanding Medicaid coverage to draw more participants, eliminating unnecessary administrative burdens by simplifying paperwork, and allowing fair compensation &#8212; can increase both the speed and inclusivity of clinical trials. These plans face an uphill battle, though, especially given the ongoing funding constraints and polarized attitudes toward agencies like the NIH and FDA.</p><p>Still, these reforms reflect distinctly American roots, targeting domestic barriers such as fragmented insurance systems and institutional red tape. In contrast, China&#8217;s progress has leaned on centralized coordination, streamlined approval pathways, and top-down incentives for hospital participation.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> Though different in method, both models show how aligning regulatory structures with participation incentives can help unlock trial capacity and volume at scale.</p><p>Countries that adapt their frameworks to expedite start-up times, recognize credible foreign data, and balance the tradeoff between sufficient oversight and innovation-friendly policies stand to capture the next wave of drug discoveries. 
<a href="https://pharmasug.org/proceedings/china2019/DS/Pharmasug-China-2019-DS58.pdf">Japan</a>, <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC7032966/">South Korea</a>, and <a href="https://biorasi.com/insight/spotlight-on-india-regulatory-enhancements-modernization-and-robust-data-collection-define-india-as-a-potential-clinical-trial-hub/#:~:text=In%20truth%2C%20while%20some%20Asian,clinical%20research%20ecosystem%20for%20India.">India</a> are all following China&#8217;s example. The U.S. should be among them.</p><div><hr></div><p>Thanks to Tony Kulesa for reading a draft of this.</p><p><strong>Hiya Jain </strong>is graduating from Columbia University with training in history and neuroscience. She writes about the history of science on her Substack, <a href="https://hiyajain.substack.com/">Mundane Beauty</a>.&nbsp;</p><p><strong>Cite: </strong>Jain, H. &#8220;China&#8217;s Clinical Trial Boom.&#8221; <em>Asimov Press </em>(2025). DOI: 10.62211/56hr-91hg</p><p>Lead image by Ella Watkins-Dulaney.</p><p><strong>Correction: </strong>An earlier version of this article misstated the length of time required for new drug approvals in China (<em>thanks to Egan Peltan for flagging the error</em>). An additional note explaining the large spike in U.S. trials between 2016 and 2017 has also been added.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>All the data in this article is from ClinicalTrials.gov, a U.S. registry that does not capture every trial run in China. These charts therefore likely underestimate the actual number of clinical trials by Chinese companies. 
Chinese clinical trial data has been available on the ClinicalTrials.gov platform since 2005, with the exception of multinational trials with sites in China.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>&#8220;China has a wealth of treatment-na&#239;ve patients in therapeutic areas where U.S. trials struggle to recruit, including immune-oncology, NASH, chronic diseases, and many orphan indications,&#8221; according to medical writer <a href="https://www.clinicalleader.com/doc/report-reflects-huge-growth-of-clinical-trials-in-china-0001">Ed Miseta</a>. &#8220;Those patients are concentrated in top urban medical centers with direct costs that are 30 percent&nbsp;lower than in the U.S. This can lead to a patient recruitment effort that is 2-3 times faster.&#8221;</p></div></div>]]></content:encoded></item><item><title><![CDATA[Recipe for a Cell]]></title><description><![CDATA[While we know how to break organisms down to their constituent parts, even at the atomic level, building them from scratch remains difficult.]]></description><link>https://www.asimov.press/p/cell-recipe</link><guid isPermaLink="false">https://www.asimov.press/p/cell-recipe</guid><dc:creator><![CDATA[Niko McCarty]]></dc:creator><pubDate>Thu, 10 Apr 2025 18:20:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!YDwr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is the second &#8220;Data Brief&#8221; in a new series quantifying biology. 
All the datasets used to create this article are freely available on <a href="https://github.com/Asimov-Press/Bio-Data/tree/main/what_cells_made_from">GitHub</a>.</em></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YDwr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YDwr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 424w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 848w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YDwr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png" width="1456" height="917" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4098406,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/160668855?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YDwr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 424w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 848w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!YDwr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d783cd-ec32-408e-bd95-87b04939cf94_2000x1260.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Biologists have long pondered whether life can be reduced to discrete chemical components or, conversely, whether those same components could be used to assemble an organism from scratch. Perhaps in the future, human hands might tailor a cell by stitching together individual molecules with the world&#8217;s finest sewing machine. But for now, this remains a distant dream.</p><p>Fortunately, cells can accomplish on their own what humans can&#8217;t. Living organisms routinely assemble new lifeforms from scratch (a.k.a. their offspring) by converting raw inputs like sugars, nitrogen, and energy into highly ordered structures, beating back the unceasing tides of entropy. 
Or, as the Austrian physicist <a href="https://en.wikipedia.org/wiki/Erwin_Schr%C3%B6dinger">Erwin Schr&#246;dinger</a> wrote in his 1944 classic, <em><a href="https://en.wikipedia.org/wiki/What_Is_Life%3F">What Is Life?</a></em>, cells refuse to succumb to chaos &#8220;by eating, drinking, breathing and (in the case of plants) assimilating.&#8221;</p><p>While we humans can&#8217;t yet perfectly replicate what cells do, we can engineer, co-opt, and <em>coerce</em> cells to do that work on our behalf. The entire premise of synthetic biology is that cells are &#8220;programmable&#8221; machines capable of nano-scale fabrication, and that one can engineer those cells to create medicines, foods, and materials.</p><p>Any effort to engineer a cell, though, should begin with a catalog of its parts. What are a cell&#8217;s necessary components? How many atoms, and how much energy, does it take to build a new one?</p><p>It&#8217;s easiest to answer these questions for the bacterium <em>Escherichia coli</em>, because that microbe has been studied more than any other. It is the subject of more than 100,000 published research papers. Scientists have marveled at <a href="https://journals.asm.org/doi/10.1128/jcm.00491-10">&#8220;flesh-eating&#8221; </a><em><a href="https://journals.asm.org/doi/10.1128/jcm.00491-10">E. coli</a></em> and engineered strains that <a href="https://pubmed.ncbi.nlm.nih.gov/16306980/">act as living &#8220;camera film.&#8221;</a></p><p>But even using a single, well-characterized species like <em>E. coli</em> as a guide, there&#8217;s no one-size-fits-all cellular template. The size and shape of each bacterium change a great deal depending on its environment. When <em>E. coli </em>cells have abundant access to nutrients, they tend to be larger. If nutrients are scant and conditions are poor, they become smaller and divide more slowly. In general, however, the mass of an <em>E. 
coli </em>cell is about 1 picogram, or one trillionth of a gram. That&#8217;s astonishingly small, roughly equal to the <a href="https://royalsocietypublishing.org/doi/abs/10.1098/rspb.2009.1004">weight of the DNA inside a single hummingbird cell</a>.</p><p>Which molecules make up this mass?</p><p>About 70 percent is just water. The other 30 percent, the so-called &#8220;dry mass,&#8221; is composed of everything else: DNA, RNA, proteins, lipids, and ions. Surprisingly, DNA accounts for just 3 percent of this dry mass, yet encodes all the information needed for the cell to grow, divide, adapt, and evolve. Most of a cell&#8217;s dry mass (55 percent) is protein. Proteins dominate because they are the &#8220;executors&#8221; of cellular functions; they catalyze reactions, form structural scaffolds, and regulate gene expression. 
About 20 percent of the dry mass is RNA, 10 percent is lipids, and the rest is ions and signaling molecules.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e24l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e24l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 424w, https://substackcdn.com/image/fetch/$s_!e24l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 848w, https://substackcdn.com/image/fetch/$s_!e24l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 1272w, https://substackcdn.com/image/fetch/$s_!e24l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e24l!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png" width="1200" height="745.054945054945" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:904,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:103693,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/160668855?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e24l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 424w, https://substackcdn.com/image/fetch/$s_!e24l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 848w, https://substackcdn.com/image/fetch/$s_!e24l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 1272w, https://substackcdn.com/image/fetch/$s_!e24l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F401c0b41-f792-41ad-b28e-fad334825209_3600x2235.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>It&#8217;s worth pausing to consider <em>why </em>proteins form the bulk of a cell&#8217;s dry mass. The simple answer is that there are lots of them, and each one is relatively heavy. A single <em>E. coli </em>cell packs upwards of two million proteins into its tiny body, and each protein contains dozens to hundreds of amino acids. A typical 300-amino-acid protein weighs around 33 kilodaltons (kDa), roughly the mass of 50 base pairs of DNA.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>A small number of protein types also account for the vast majority of all proteins in a cell. 
In 1978, researchers believed that EF-Tu &#8212; a protein that works with ribosomes to synthesize proteins &#8212; was most abundant, with <a href="https://www.annualreviews.org/content/journals/10.1146/annurev.bi.47.070178.002405">a few hundred thousand copies</a> per cell. Then, in 1979, a paper in <em>Cell</em> made the claim (still held today) that the most abundant protein is Lpp, with <a href="https://www.cell.com/cell/pdf/0092-8674(79)90224-1.pdf">over 700,000 copies per </a><em><a href="https://www.cell.com/cell/pdf/0092-8674(79)90224-1.pdf">E. coli</a></em><a href="https://www.cell.com/cell/pdf/0092-8674(79)90224-1.pdf"> cell</a>. Lpp holds the outer membrane to the peptidoglycan layer, preventing cells from collapsing. Without this protein, a cell would be crushed by its turbulent surroundings.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/hmMl2/7/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b291caff-7270-4fd5-8568-1eb2041fba9b_1260x660.png&quot;,&quot;thumbnail_url_full&quot;:&quot;&quot;,&quot;height&quot;:940,&quot;title&quot;:&quot;Abundance of proteins in E. coli, colored by cellular function&quot;,&quot;description&quot;:&quot;Concentration of more than 2,300 proteins in E. coli bacteria grown in media containing glucose at 42&#176;C. 
The most abundant protein in this dataset is TufA (also known as elongation factor Tu, or EF-Tu), which aids in protein synthesis.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/hmMl2/7/" width="730" height="940" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><p>All of this points to a final question: Namely, how much energy would it take to engineer these components from scratch?</p><p>One might assume that it would take a lot. After all, there are millions of proteins per cell, each crafted by stitching together individual amino acids into long chains. It requires a great deal of energy to make each amino acid and then combine them. Similar considerations apply for DNA, RNA, and all the other molecules, too.</p><p>But the <em>actual </em>amount of energy required is surprisingly small. Scientists recently used <a href="https://www.nature.com/articles/s41598-024-54303-6">computational models</a> to estimate the energy cost of assembling every part of a single cell. They first used experimental data to quantify <em>E. coli</em>&#8217;s molecular composition, and then applied a <a href="https://en.wikipedia.org/wiki/Group-contribution_method">group-contribution algorithm</a> to calculate the standard Gibbs free energy required to form each type of biomolecule. By adding together all these calculations, the researchers arrived at the minimum energy required to build an <em>E. 
coli</em>: 9.54 &#215; 10<sup>&#8722;11</sup> joules.</p><p>Interestingly, constructing the lipid bilayer, the membrane that partitions a cell from its environment, turns out to be the second most energetically expensive part of the cell to create, even though lipids account for only about 10 percent of dry mass. This is because lipids are made from long carbon chains joined by high-energy ester linkages, and so require more Gibbs free energy per gram to synthesize than any other type of biomolecule (even proteins).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jdE8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jdE8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 424w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 848w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 1272w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jdE8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png" width="1456" height="2186" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2186,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:252121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.asimov.press/i/160668855?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jdE8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 424w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 848w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 1272w, https://substackcdn.com/image/fetch/$s_!jdE8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc753a597-5878-4d72-aeea-7b44113304c3_1640x2462.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>While we now have a nearly complete catalog of <em>E. coli</em>&#8217;s molecular parts, and estimates of the energy required to assemble them, building a functioning cell from scratch remains one of biology&#8217;s most elusive goals. The challenge is not merely mechanical, either. Even if one gathers all the right molecules in the right quantities, arranging them with sufficient precision to &#8220;create life&#8221; remains far beyond current capabilities.</p><p>This is, in part, because cells are <em>dynamic</em>. They move and change quickly. 
Sugar molecules fly through a cell at 250 miles per hour, and each protein is bombarded by <a href="https://www.asimov.press/p/fast-biology">10<sup>13</sup> water molecules</a> every second. This dynamism is often overlooked in the methods used to study living cells. A list of ingredients is inherently static. The next step, then, is to capture how each piece moves and interacts. Only then might we turn our catalogs of parts into recipes for a cell.</p><div><hr></div><p><strong>Niko McCarty </strong>is a founding editor of <em>Asimov Press</em>.</p><p><strong>Cite: </strong>McCarty N. &#8220;Recipe for a Cell.&#8221; <em>Asimov Press </em>(2025). DOI: 10.62211/34pk-99hr</p><p>Lead image by Ella Watkins-Dulaney.</p><p><strong>Correction: </strong>An earlier version of this article included an incorrect calculation of the energy required to build an <em>E. coli </em>cell.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Once we know the macromolecules that make up an <em>E. coli </em>cell, we can also estimate the raw elements, or even the chemical formula, used to build such a cell. A typical <em>E. coli </em>cell contains about 10<sup>10</sup> carbon atoms and has an approximate chemical formula of C<sub>4.4</sub>H<sub>7.2</sub>O<sub>2.1</sub>N<sub>0.8</sub>P<sub>0.086</sub>S<sub>0.039</sub>. This means that for every 4.4 carbon atoms, the cell has 7.2 hydrogens, 2.1 oxygens, 0.8 nitrogens, and a tiny fraction of phosphorus and sulfur. 
This formula is derived by averaging the approximate composition of proteins, nucleic acids, lipids, carbohydrates, and other cell components.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>One kilodalton is equal in mass to 83 carbon-12 atoms.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>A <a href="https://www.science.org/doi/10.1126/sciadv.add8659">2023 paper</a> in <em>Science Advances</em> used atomic force microscopy to visualize Lpp in individual cells and concluded that each cell contains hundreds of thousands to about one million copies.</p></div></div>]]></content:encoded></item><item><title><![CDATA[What Limits a Cell’s Size?]]></title><description><![CDATA[Two physical constraints help explain why cells are so tiny: surface area-to-volume ratios and diffusion. The first article in our new Data Series.]]></description><link>https://www.asimov.press/p/cell-size</link><guid isPermaLink="false">https://www.asimov.press/p/cell-size</guid><dc:creator><![CDATA[Niko McCarty]]></dc:creator><pubDate>Wed, 26 Mar 2025 14:33:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4a593dfa-7da2-449d-954c-2a88dbd97464_2000x1260.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is the first &#8220;Data Brief&#8221; in a new series quantifying biology. Please send feedback and corrections to <a href="mailto:editors@asimov.com">editors@asimov.com</a>. 
All the datasets created for this article are freely available on <a href="https://github.com/Asimov-Press/Bio-Data/tree/main/what_limits_cell_size">GitHub</a>.</em></p><div><hr></div><p>A human body is built from about 30 trillion cells, excluding microbes, all of which descend from a single fertilized egg. These cells come in a wide variety of shapes and sizes, with internal volumes spanning five orders of magnitude. The smallest human cell, a sperm, fills a volume of just <a href="https://bionumbers.hms.harvard.edu/bionumber.aspx?id=109891&amp;ver=1&amp;trm=sperm+volume&amp;org=">30 &#181;m&#179;</a>, whereas an oocyte boasts a volume of <a href="https://bionumbers.hms.harvard.edu/bionumber.aspx?id=101664&amp;ver=16&amp;trm=oocyte+volume&amp;org=">4,000,000 &#181;m&#179;</a>, making it the largest cell in the human body.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>What accounts for this huge range? A simplistic answer would be that evolution has made each cell whatever size best suits its function. Under this explanation, sperm are small because the body produces many of them, and smaller cells take less energy to make. Sperm are also ultra-minimal, stripped down to little more than DNA and the few mitochondria needed to power their whip-like tails. By contrast, an oocyte needs massive reserves of mitochondria and nutrients to support early embryonic growth. In short, every cell is as large or small as it needs to be, <em>within reason</em>.</p><p>But we can derive far more satisfying answers from physics.</p><p>One major limitation on a cell&#8217;s size is its <strong>surface area-to-volume ratio</strong>. Assuming that a cell is roughly spherical in shape, its internal volume grows in proportion to the <em>cube </em>of the sphere&#8217;s radius, and its surface area grows in proportion to the <em>square </em>of that radius. 
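</p><p>The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming an idealized spherical cell; the radii are hypothetical values chosen only to illustrate the trend, not measurements of real cells.</p>

```python
import math


def sphere_surface_area(radius_um):
    """Surface area of a sphere, in square micrometers."""
    return 4.0 * math.pi * radius_um ** 2


def sphere_volume(radius_um):
    """Volume of a sphere, in cubic micrometers."""
    return (4.0 / 3.0) * math.pi * radius_um ** 3


def surface_to_volume(radius_um):
    """Surface area divided by volume; for a sphere this simplifies to 3 / r."""
    return sphere_surface_area(radius_um) / sphere_volume(radius_um)


# Hypothetical radii (micrometers), chosen only to illustrate the trend.
for radius_um in (0.5, 5.0, 50.0):
    ratio = surface_to_volume(radius_um)
    print(f"radius {radius_um:5.1f} um -> surface/volume = {ratio:.2f} per um")
```

<p>Under this idealization the ratio reduces to 3 divided by the radius, so every tenfold increase in radius leaves a cell with one-tenth as much membrane per unit of interior volume.</p><p>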
In other words, a cell&#8217;s volume grows much more swiftly than its surface area.</p><p>This ratio has serious consequences for cellular survival. The cell&#8217;s membrane funnels nutrients into the cell and secretes waste. It&#8217;s also where the energy in a prokaryotic cell &#8212; like <em>E. coli </em>&#8212; gets made. If the interior grows too large relative to the membrane, the cell&#8217;s metabolic processes slow to a crawl.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MN_h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MN_h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 424w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 848w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 1272w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!MN_h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png" width="1456" height="1096" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1096,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:130013,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MN_h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 424w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 848w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 1272w, https://substackcdn.com/image/fetch/$s_!MN_h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde130136-9b6c-4f79-bd21-2e7fc19373ec_1480x1114.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>A second constraint is <strong>diffusion</strong>, or the tendency of molecules to migrate from areas of high concentration to areas of lower concentration. This migration dictates how quickly enzymes find substrates, how effectively signaling molecules reach receptors, and how often ribosomes collide with messenger RNAs. Inside a cell, nearly everything happens by chance encounters among these molecules. As a cell&#8217;s volume grows, the chance of any given encounter diminishes (assuming the numbers of molecules stay constant).</p><p>A molecule&#8217;s diffusion rate hinges on several factors. 
For instance, the cytoplasm is extremely crowded, so molecules spend lots of time ricocheting off obstacles, delaying their arrival at a distant location. Every protein in a cell collides with about 10 billion water molecules per second, on average. These frequent collisions mean that the vast majority of proteins in a bacterium have diffusion coefficients of only 5 to 10 &#181;m<sup>2</sup> per second. Some molecules also aggregate, or stick to charged surfaces, further slowing their movement. (Molecules move very slowly <a href="https://www.science.org/doi/10.1126/sciadv.abo5387">near the densely packed nucleoid</a>, for example.) In general, larger molecules diffuse more sluggishly than smaller ones.</p><p>Metabolites in <em>E. coli</em> can diffuse from one side of the cell to the other in milliseconds, which means collisions, and cellular <em>outcomes</em>, occur quickly. A typical protein takes just <a href="https://bionumbers.hms.harvard.edu/bionumber.aspx?id=103801">0.01 seconds</a> to traverse a bacterium&#8217;s diameter (about 1 micrometer). But because diffusion time grows with the square of distance, the <em>same </em>protein would take around four minutes to move 100 micrometers and more than <em>six hours</em> to move one millimeter. 
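</p><p>The distance dependence behind these figures can be sketched with the standard relation between mean squared displacement and time, t = x<sup>2</sup> / (2nD), where n is the number of spatial dimensions. The snippet below is a rough illustration in Python, not the calculation used for this article; the diffusion coefficient of 7 &#181;m<sup>2</sup> per second is an assumed value inside the 5 to 10 range quoted above, and the function name is hypothetical.</p>

```python
def diffusion_time_s(distance_um, diffusion_coeff_um2_per_s=7.0, dimensions=3):
    """Rough time in seconds to diffuse a distance: t = x^2 / (2 * n * D).

    The default D of 7 um^2/s is an assumption for a typical bacterial protein.
    """
    return distance_um ** 2 / (2 * dimensions * diffusion_coeff_um2_per_s)


one_diameter = diffusion_time_s(1.0)    # roughly one bacterial diameter
ten_diameters = diffusion_time_s(10.0)  # ten times farther

print(f"1 um:  {one_diameter:.3f} s")
print(f"10 um: {ten_diameters:.2f} s")
print(f"time ratio for 10x the distance: {ten_diameters / one_diameter:.0f}x")
```

<p>Because time scales with the square of distance in this model, a path ten times longer takes about one hundred times as long to cross.</p><p>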
This is, in part, why cells are so tiny.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9PM3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9PM3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 424w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 848w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 1272w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9PM3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png" width="1456" height="1192" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1192,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:208280,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9PM3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 424w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 848w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 1272w, https://substackcdn.com/image/fetch/$s_!9PM3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe71bd59e-575c-4fa7-bede-99520a53824a_1480x1212.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>With these constraints in mind, we can begin to speculate as to why various cells are shaped the way they are. Red blood cells are tiny and shaped like biconcave discs to aid with diffusion; by abandoning a spherical shape and evolving more toward a &#8220;donut,&#8221; they increase their surface area without compromising their compact volume. This, in turn, enhances their ability to exchange oxygen with cells in the body. Their small size (just 8 micrometers across) also helps them move through narrow capillaries.</p><p>In contrast, oocytes can grow so large (around 100 micrometers in diameter), in part, because they are less metabolically active than other types of human cells &#8212; and thus don&#8217;t depend so much on &#8220;random&#8221; collisions. They stockpile nutrients during oogenesis to &#8220;wait out&#8221; fertilization. 
Eukaryotic cells can also grow large, in general, because they&#8217;ve evolved <em>compartmentalization</em>; by modularizing specific functions into organelles, they bring molecules closer together to help get the job done.</p><p>Cell sizes are not fixed, however, even within a single species. Cells often swell as they increase their production of proteins and metabolites in preparation for division. This is in line with biology&#8217;s only rule: namely, there are exceptions to every rule!</p><p>Case in point: a giant bacterium called <em>Thiomargarita magnifica</em> can exceed one centimeter in length, so large that it can be seen with the naked eye. It does so by sidestepping the surface area-to-volume rule, filling 65 to 80 percent of its internal volume with an empty vacuole. In other words, it pushes most of its &#8220;working&#8221; molecules to the cell periphery, thus shortening diffusion distances.</p><p>Despite their variety, these architectures still hinge on molecules bumping into each other, guided by the immutable laws of physics. 
Or, as D&#8217;Arcy Wentworth Thompson mused in <em>On Growth and Form</em> (1917), &#8220;The form of an object is a &#8216;diagram of forces.&#8217;&#8221; Cells bear witness to both internal and external forces; they are constrained by diffusion and shaped by the delicate trade-off between volume and surface area.</p><div><hr></div><p><strong>Niko McCarty </strong>is a founding editor of <em>Asimov Press</em>.</p><p>Thanks to Ben Andrew and Ashish Uppala for reviewing drafts and contributing data.</p><p><strong>Cite: </strong>McCarty, N. &#8220;What Limits a Cell&#8217;s Size?&#8221; <em>Asimov Press </em>(2025). DOI: 10.62211/97to-41re</p><p>Lead image by <a href="https://ellawatkinsdulaneyphd.com">Ella Watkins-Dulaney</a>.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>I find it difficult to wrap my head around large numbers and have found it helpful to rethink them as real-world analogies. To understand the difference between a million and a billion, for example, one could think about time. 
If a worker earned $100 per hour, it would take them 416 days of non-stop work to make $1 million and more than 1,100 non-stop years of work to make $1 billion. That&#8217;s a big difference! Similarly, when thinking about biological numbers, it&#8217;s helpful to scale things up and recontextualize them. Imagine a sperm &#8220;blown up&#8221; to the size of a glass marble, about two cubic centimeters in volume. A single oocyte, scaled in the same way, would then occupy roughly the size of a refrigerator &#8212; more than 100,000-times larger.</p></div></div>]]></content:encoded></item><item><title><![CDATA[The Making of a Gene Circuit]]></title><description><![CDATA[An interactive visualization of the repressilator, a genetic circuit that gave rise to synthetic biology.]]></description><link>https://www.asimov.press/p/gene-circuit</link><guid isPermaLink="false">https://www.asimov.press/p/gene-circuit</guid><dc:creator><![CDATA[Niko McCarty]]></dc:creator><pubDate>Sun, 16 Feb 2025 17:39:14 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/69dd41f7-0aae-4667-b1c3-8181b026c02e_2000x1260.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the late 1990s, a young physicist named Michael Elowitz decided to &#8220;program&#8221; living cells.&nbsp;</p><p>A graduate student at Princeton University, Elowitz was spending a great deal of his free time poring over esoteric papers about circadian clocks, the molecular networks that control an organism&#8217;s behavior in roughly 24-hour cycles.</p><p>&#8220;And as I&#8217;m reading all of these papers,&#8221; he <a href="https://press.asimov.com/articles/synthetic-origins">told </a><em><a href="https://press.asimov.com/articles/synthetic-origins">Asimov Press</a></em>, &#8220;I noticed that many of them concluded with a cartoon model of the biological circuit deduced from the genetic or biochemical measurements in the paper. 
What I most remember is just feeling like, are these circuit sketches really sufficient to explain the behavior? Or are they just a summary of observed interactions, possibly omitting many other critical components? It was driving me crazy.&#8221;</p><p>Richard Feynman&#8217;s iconic admonition &#8212; &#8220;What I cannot create, I do not understand&#8221; &#8212; resonated with Elowitz, who decided to answer his own questions by building a <em>synthetic </em>molecular clock, one not found anywhere in nature. Inspired by his readings, Elowitz began designing an oscillator that would force living cells to flash on and off in a periodic rhythm. Despite his initial enthusiasm, however, doubts quickly mounted.</p><p>&#8220;When I asked people what they thought of the project,&#8221; he said, &#8220;I got very different answers. A few well-known biologists would say, &#8216;No, it&#8217;ll never work that way. It just won&#8217;t work.&#8217; And I&#8217;d ask them, &#8216;Why won&#8217;t it work?&#8217; And they&#8217;d say, &#8216;Biology just doesn&#8217;t really work that way. You can&#8217;t predict what&#8217;s going to happen.&#8217;&#8221;</p><p>By 2000, though, Elowitz had proven those doubters wrong. Alongside physicist Stanislas Leibler, he <a href="https://www.nature.com/articles/35002125">published his findings</a> in <em>Nature</em>; the duo had successfully created a biological oscillator, endowing living cells with an artificial rhythm. That <em>Nature </em>paper appeared back-to-back with another report of a synthetic gene circuit, called the &#8220;<a href="https://www.nature.com/articles/35002131">toggle switch</a>,&#8221; developed by Jim Collins, Timothy Gardner, and Charles Cantor at Boston University. 
Neither group knew about the other&#8217;s work,<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> but together they launched the field of synthetic biology.</p><p>Technologies to &#8220;program biology&#8221; have come a long way since the repressilator was first introduced twenty-five years ago. Synthetic biologists have recently designed interacting protein clusters that <a href="https://www.science.org/doi/10.1126/science.add8468">act as neural networks</a> inside living cells, built gene circuits that can switch between OR and AND logic gates based on <a href="https://www.nature.com/articles/s41467-022-33288-8">small molecule triggers</a>, and even programmed a community of cells to execute a hashing function widely used in cryptography.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>Many modern gene circuits, though, are incredibly complicated. They are depicted in research papers as dense tangles of arrows, triangles, and other symbols, similar to the diagrams found in electronic engineering textbooks. The repressilator offers a useful starting point for parsing this complexity. After all, the same design approach that Elowitz used to build his oscillator still provides the basis for assembling modern gene circuits. 
By understanding how the repressilator was made and how it really works, anyone can grasp the basic principles upon which synthetic biology was built.</p><p>With that in mind, we created an <a href="https://observablehq.com/d/b24094a4030612b6">interactive chart</a> that visualizes the repressilator&#8217;s dynamics &#8212; and allows you to play around with them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cdVf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cdVf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 424w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 848w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 1272w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cdVf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png" width="725.140625" height="364.1675385462555" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:798,&quot;width&quot;:1589,&quot;resizeWidth&quot;:725.140625,&quot;bytes&quot;:222674,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cdVf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 424w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 848w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 1272w, https://substackcdn.com/image/fetch/$s_!cdVf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2129647d-3840-4c7c-9601-6a49d6a78c8f_1589x798.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Explore the repressilator&#8217;s dynamics by visiting the <a href="https://press.asimov.com/articles/gene-circuit">Asimov Press website</a> or viewing our <a href="https://observablehq.com/d/b24094a4030612b6">ObservableHQ notebook</a>.</figcaption></figure></div><h2>Building a Circuit</h2><p>Much like electronic circuits, which use wires and transistors to create or process signals, a synthetic gene circuit is a network of biological parts &#8212; DNA, RNA, and proteins &#8212; that process inputs and generate outputs. A working gene circuit can detect many different inputs, such as a molecule, flash of light, or a physical force, and then translate those signals into everything from making a fluorescent protein to emitting an odor molecule. 
For an oscillator, the output is simply a pattern of on-off signaling that repeats over time, akin to a blinking light on a circuit board.</p><p>Elowitz&#8217;s oscillator design was made of just three genes &#8212; <em>lacI</em>, <em>tetR</em>, and <em>cI</em> &#8212; linked together in a negative feedback loop, reminiscent of a <a href="https://en.wikipedia.org/wiki/Ring_oscillator">three-stage ring oscillator</a> in electronics. Each gene encodes a repressor protein that binds to a specific DNA sequence, called a <em>promoter</em>, upstream of a different repressor gene and blocks RNA polymerase from transcribing that gene.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>In the repressilator, LacI blocks <em>tetR</em>, TetR shuts down <em>cI</em>, and cI represses <em>lacI</em>. When LacI is abundant, it switches off the <em>tetR</em> gene. But LacI eventually degrades or falls off the DNA, allowing <em>tetR</em> to turn on and make TetR proteins that then repress <em>cI</em>. And so on. 
This cyclical process is ultimately what drives the cell&#8217;s oscillations.</p><p>To confirm these oscillations, Elowitz added a separate reporter plasmid encoding green fluorescent protein (GFP) to his cells. The reporter&#8217;s promoter contains a DNA sequence that TetR recognizes, so when TetR levels rise, TetR blocks the GFP gene and cells stay dark. When TetR levels drop, the cells flash a brilliant green.</p><p>Elowitz inserted his assembled DNA into <em>E. coli</em> and then used a fluorescence microscope to observe the cells in real time. Unfortunately, his microscope lacked autofocus, so cells drifted in and out of focus. He had to remain at the microscope day and night to adjust it manually before taking each photograph. In one especially grueling stretch, he slept next to the microscope with an alarm clock, waking every hour to refocus the image.</p><p>Fortunately, his work paid off. Elowitz&#8217;s repressilator caused cells to flash green about every 150 minutes, albeit with variations between cells. 
The results convinced skeptics that researchers could engineer predictable circuits into living cells.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zlz-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zlz-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 424w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 848w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 1272w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zlz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png" width="728" height="1653.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:3307,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:809933,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zlz-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 424w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 848w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 1272w, https://substackcdn.com/image/fetch/$s_!Zlz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1416427-407f-4832-994f-2b00ebf5cdb2_2100x4770.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Modeling a Circuit</h2><p>Before building the repressilator, Elowitz needed to understand the quantitative values of various parameters that would ensure its oscillations in the cell. He modeled the repressilator using six differential equations, a standard mathematical approach for describing how molecular concentrations change over time. 
Each of the three genes &#8212; <em>lacI</em>, <em>tetR</em>, and <em>cI</em> &#8212; makes both mRNA and a corresponding protein, so Elowitz wrote a separate equation for each molecule of each gene: two equations for each of the three genes.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\begin{align}\n\\frac{dm_i}{dt} &amp;= -m_i + \\frac{\\alpha}{1 + \\rho_j^n} + &#956; \\\\[6pt]\n\\frac{d\\rho_i}{dt} &amp;= -\\beta\\bigl(\\rho_i - m_i\\bigr)\n\\end{align}\n\n  \\text{where } i = \\{\\text{lacI},\\text{tetR},\\text{cI}\\}\n  \\quad\\text{and}\\quad\n  j = \\{\\text{cI},\\text{lacI},\\text{tetR}\\}.&quot;,&quot;id&quot;:&quot;MIKXJNWOBV&quot;}" data-component-name="LatexBlockToDOM"></div><p>In real cells, each gene and protein differs in subtle ways. Some repressors bind DNA more tightly than others, for example, or decay more rapidly. Elowitz simplified his mathematical model by assuming that all three genes behave identically and assigning them the same parameter values. This choice allowed him to understand what parameter regimes generally favored oscillation, while neglecting less significant effects due to differences among the three genes.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> In total, the model involves six quantities, four tunable parameters and two variables:</p><ol><li><p>Leakiness (&#956;) measures how much gene expression slips through when a promoter is bound by its repressor. Even when LacI binds to the <em>tetR</em> promoter, for example, a small amount of TetR still gets made, because LacI sometimes falls off the DNA and leaves a brief window for RNA polymerase to bind and transcribe the gene. Too much leakiness can ruin an oscillator by preventing it from cycling.</p></li><li><p>Promoter strength (&#945;) indicates how quickly RNA polymerase transcribes mRNA when the repressor is not bound. A &#8220;strong&#8221; promoter makes more mRNA, which typically leads to more repressor protein. 
If a promoter is too weak, not enough repressor gets made to shut down the next gene in the loop, and the rhythm stops.</p></li><li><p>Decay rate ratio (&#946;) is the ratio of the protein decay rate to the mRNA decay rate. Bacterial mRNA usually breaks down in a few minutes, whereas proteins stick around the cell for longer. Researchers can tweak this ratio by engineering proteins to degrade more quickly, such as by fusing them to little peptide tags &#8212; called degrons &#8212; that signal cellular proteases to break them down. Higher protein turnover rates usually speed up oscillations.</p></li><li><p>The Hill coefficient (n) measures how sharply transcription flips between on and off as the repressor concentration changes. A value near 1 causes gradual shifts that might not allow enough &#8220;overshoot&#8221; of protein concentrations to sustain oscillations, while higher values (around 2 or 3) create abrupt transitions that favor such overshooting, leading to more stable cycles.</p></li><li><p>m and &#961; are the variables that the equations track: the concentrations of mRNA and protein, respectively, for each gene.</p></li></ol><p>Try sliding these parameters around in the interactive diagram to see how each one affects the repressilator&#8217;s dynamics. Notice how reducing &#946; makes proteins linger longer, thus stretching out each cycle. Boosting &#945; (promoter strength) increases the amount of repressor made, changing the amplitude of oscillations. Increasing &#956; (leakiness) past a certain point degrades the oscillations or, at high enough levels, prevents them from starting altogether.</p><p>Remember that these equations are <em>approximations</em>. This model aims to guide the design of the circuit, rather than to represent all of its molecular interactions in full detail. In reality, repressor proteins bind to DNA in a discrete and probabilistic way, but these equations approximate their behaviors as if the number of molecules in the cell were continuous. 
Also, RNA polymerase moves in bursts along DNA, rather than at a steady pace. Despite these limitations, these six equations gave Elowitz the key insights he needed to design an actual repressilator by stitching together genes and inserting them into microbes.</p><h2>Tuning a Circuit</h2><p>Early versions of the repressilator didn&#8217;t create a perfect rhythm in every cell. Elowitz observed oscillations with a period of about 150 minutes, but the period varied from one cell to the next, and only around forty percent of the cells oscillated during any given movie. This variation underscored the immense challenge of coaxing thousands of individual bacterial cells to &#8220;tick&#8221; in near-unison. It also raised the question of why a well-defined circuit, designed down to the nucleotide by a human being, behaved so differently in genetically identical cells.</p><p>Over time, synthetic biologists dissected Elowitz&#8217;s results and figured out that small differences in repressor strength, protein decay rates, and cellular resources can throw the oscillator off. The cells&#8217; environment matters, too.</p><p>Elowitz grew his cells on glucose gel pads. But as the bacteria multiplied, they eventually ate up all the sugar and began swimming in their own waste, thus disrupting the circuit&#8217;s rhythm. In 2010, a group of Harvard scientists invented a device called a &#8220;<a href="https://doi.org/10.1016/j.cub.2010.04.045">mother machine</a>&#8221; to solve this problem. Their device traps individual cells in tiny wells, bathing them in fresh nutrients while washing away waste, which stabilizes the repressilator&#8217;s cycles.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;690ffeac-3029-47b2-bb60-1b1037e127cc&quot;,&quot;duration&quot;:null}"></div><p>Scientists have also tweaked the repressilator at the genetic level to improve reliability. 
In 2016, Laurent Potvin-Trottier and colleagues at Harvard University consolidated the three repressor genes and fluorescent reporter onto a single plasmid so that each gene would be expressed at comparable levels. Noticing that TetR bound its DNA target more strongly than LacI or cI, they also created a &#8220;DNA sponge&#8221; &#8212; extra binding sites that soak up excess TetR &#8212; to bring its binding strength down to a level that matched the others more closely. With these tweaks, Potvin-Trottier shrank the standard deviation of periods between cells to just 14 percent of the mean.</p><p>After the <a href="https://www.nature.com/articles/nature19841">Potvin-Trottier study</a> was published in <em>Nature</em>, Elowitz and his student, Xiaojing Gao, <a href="https://www.nature.com/articles/nature19478">fired off a response</a>; their letter strikes at the heart of the idea that one <em>can </em>achieve precision in biology using simple gene circuits (bold emphasis our own):</p><blockquote><p>...in the most precise of Potvin-Trottier and colleagues' circuits, the standard deviation in period length was reduced from 35% of the mean to around 14%, with strikingly uniform pulse shapes and amplitudes observed. This repressilator generates a pulse of fluorescent-protein expression just once every 14 generations. Assuming a cell-cycle time of 1 hour, <strong>it would take around 7.5 days, or 180 cell cycles, for a colony of cells to accumulate a standard deviation of half a period of drift</strong>. This extraordinary precision means that even a large population of cells can remain in sync. In fact, the authors were able to visualize oscillation dynamics in a test-tube culture &#8230; Evidently, <strong>precision does not necessarily demand circuit complexity</strong> and, in this case, even seems to benefit from minimalism.</p></blockquote><p>The repressilator remains a foundational example of synthetic biology&#8217;s ability to blend mathematics with predictable outcomes. 
Elowitz showed that one can design gene networks that exist nowhere in nature, predict their behavior using a bit of calculus, and then actually use them to get cells to carry out new behaviors.</p><p>Today, the repressilator&#8217;s impact extends far beyond making cells pulse rhythmically. Its core ideas &#8212; negative feedback, parameter tuning, cooperativity, and molecular noise &#8212; continue to guide the design of everything from biological logic gates and memory modules to biosensors and entire metabolic pathways.</p><p>And it all started with some mathematical equations and an idea.</p><div><hr></div><p><em>Thanks to Michael Elowitz for reading a draft of this.</em></p><p><strong>Nehal Udyavar </strong>is a design engineer who creates explorable explanations of biological systems at <a href="https://www.newtinteractive.com/">Newt Interactive</a>. He shares his learning journey on his <a href="https://nehalslearnings.substack.com/">Substack</a>.</p><p><strong>Niko McCarty </strong>is a founding editor of <em>Asimov Press</em>.</p><p><strong>Cite: </strong>Udyavar N. &amp; McCarty N. &#8220;The Making of a Gene Circuit.&#8221; <em>Asimov Press </em>(2025). DOI: 10.62211/23ey-67yy</p><p>Lead image by <a href="https://ellawatkinsdulaneyphd.com">Ella Watkins-Dulaney</a>.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>There were actually four papers that all came out around the same time, but the Elowitz and Collins papers received the most attention by far. 
William Farmer and James Liao designed and assembled &#8220;a regulatory circuit to control gene expression in response to intracellular metabolic states,&#8221; <a href="https://www.nature.com/articles/nbt0500_533">published in </a><em><a href="https://www.nature.com/articles/nbt0500_533">Nature Biotechnology</a> </em>in May 2000, whereas Attila Becskei and Luis Serrano built an autoregulation gene circuit, <a href="https://www.nature.com/articles/35014651">published in </a><em><a href="https://www.nature.com/articles/35014651">Nature</a> </em>in June.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Specifically, the MD5 hashing function. It was built by distributing <a href="https://microbiology.mit.edu/events/microbiology-thesis-defense-jai-padmakumar-voigt-lab/">110 logic gates</a> across 65 unique <em>E. coli </em>strains.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>&#8220;Transcribing&#8221; a gene simply means that an enzyme called RNA polymerase turns DNA into RNA.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>A later study by Johan Paulsson&#8217;s group showed that a specific aspect of the TetR system &#8212; namely, its very tight binding to DNA &#8212; made the circuit more &#8220;noisy&#8221; than it otherwise would have been. So experimentally, differences between the repressors are extremely important.</p></div></div>]]></content:encoded></item></channel></rss>