{"id":674,"date":"2026-04-06T04:50:14","date_gmt":"2026-04-05T20:50:14","guid":{"rendered":"https:\/\/connectword.dpdns.org\/?p=674"},"modified":"2026-04-06T04:50:14","modified_gmt":"2026-04-05T20:50:14","slug":"meet-maxtoki-the-ai-that-predicts-how-your-cells-age-and-what-to-do-about-it","status":"publish","type":"post","link":"https:\/\/connectword.dpdns.org\/?p=674","title":{"rendered":"Meet MaxToki: The AI That Predicts How Your Cells Age \u2014 and What to Do About It"},"content":{"rendered":"<p>Most foundation models in biology have a fundamental blind spot: they see cells as frozen snapshots. Give a model a single-cell transcriptome \u2014 a readout of which genes are active in a cell at a given moment \u2014 and it can tell you a lot about what that cell is doing right now. What it can\u2019t tell you is where that cell is headed.<\/p>\n<p>That limitation matters enormously when studying aging. Age-related diseases like heart disease, Alzheimer\u2019s dementia, and pulmonary fibrosis don\u2019t happen overnight. They unfold across decades, driven by slow, progressive shifts in gene network states. To understand and eventually reverse these trajectories, you need a model that thinks in time \u2014 not just in snapshots.<\/p>\n<p>That\u2019s exactly what MaxToki is designed to do.<\/p>\n<h3 class=\"wp-block-heading\"><strong>What MaxToki Is, Under the Hood<\/strong><\/h3>\n<p>The team involved in this research includes researchers from institutions like the Gladstone Institute of Cardiovascular Disease, the Gladstone Institute of Data Science and Biotechnology, and the Gladstone Institute of Neurological Disease, all alongside the University of California San Francisco\u2019s Division of Cardiology, Biological and Medical Informatics Graduate Program, Department of Pathology, Department of Neurology and Bakar Aging Research Institute, Department of Pediatrics and Cardiovascular Research Institute, and Institute for Human Genetics. 
Also contributing were the University of California Berkeley\u2019s Department of Molecular and Cell Biology and NVIDIA, along with Germany\u2019s Goethe University Frankfurt (Institute of Cardiovascular Regeneration and Centre for Molecular Medicine), the German Center for Cardiovascular Research, the Cardiopulmonary Institute, and the Clinic for Cardiology at University Hospital Frankfurt, as well as the Center for iPS Cell Research and Application at Kyoto University. <strong>MaxToki<\/strong> is a transformer decoder model \u2014 the same architectural family behind large language models \u2014 but trained on single-cell RNA sequencing data. <strong>The model comes in two sizes: 217 million and 1 billion parameters.<\/strong><\/p>\n<p>The key representational choice is the rank value encoding. Rather than feeding raw transcript counts into the model, each cell\u2019s transcriptome is represented as a ranked list of genes, ordered by their relative expression within that cell after scaling by each gene\u2019s expression across the entire pretraining corpus. This nonparametric approach deprioritizes ubiquitously expressed housekeeping genes and amplifies genes such as transcription factors, which have a high dynamic range across distinct cell states even when lowly expressed in absolute terms. It is also more robust to technical batch effects, since relative rankings within a cell are more stable than absolute count values.<\/p>\n<p>Training happened in two stages. <strong>Stage 1<\/strong> used Genecorpus-175M \u2014 approximately 175 million single-cell transcriptomes from publicly available data across a broad range of human tissues in health and disease, covering 10,795 datasets and generating approximately 290 billion tokens. 
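As a concrete illustration, the rank value encoding described above can be sketched in a few lines. This is a minimal sketch, assuming the corpus-wide scaling statistic is a per-gene median and using invented gene names; it is not the paper's exact pipeline:

```python
def rank_value_encode(raw_counts, corpus_medians, gene_ids):
    # Sketch: scale each gene's count in this cell by its corpus-wide
    # expression statistic (assumed: nonzero median), then list the
    # expressed genes in descending scaled order.
    scaled = [c / m for c, m in zip(raw_counts, corpus_medians)]
    expressed = [i for i, s in enumerate(scaled) if s > 0]
    expressed.sort(key=lambda i: -scaled[i])
    return [gene_ids[i] for i in expressed]

# A housekeeping gene ("HK") with huge raw counts but a huge corpus-wide
# median ranks below a lowly expressed transcription factor ("TF").
print(rank_value_encode([500.0, 3.0, 0.0],
                        [450.0, 0.5, 1.0],
                        ["HK", "TF", "OFF"]))  # ['TF', 'HK']
```

Because only the ordering survives, a batch effect that rescales every gene's counts by a constant leaves the encoding unchanged, which is the robustness property noted above.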
Malignant cells and immortalized cell lines were excluded because their gain-of-function mutations would confound what the model learns about normal gene network dynamics, and no single tissue was permitted to compose more than 25% of the corpus. The model was trained with an autoregressive objective: given the preceding genes in the rank value encoding, predict the next ranked gene \u2014 conceptually identical to how language models predict the next token in a sentence.<\/p>\n<p>A key technical finding from Stage 1 is that model performance on the generative objective scaled as a power law with the number of parameters. This motivated the choice to fully pretrain exactly two variants \u2014 the 217M and 1B \u2014 rather than exploring the full spectrum, balancing performance against compute budget constraints.<\/p>\n<p><strong>Stage 2<\/strong> extended the context length from 4,096 to 16,384 tokens using RoPE (Rotary Positional Embeddings) scaling \u2014 a technique that interpolates more tokens into the existing positional framework by reducing the rotation frequency. This expanded context allowed the model to process multiple cells in sequence, enabling temporal reasoning across a trajectory rather than reasoning about one cell at a time. Stage 2 training used Genecorpus-Aging-22M: approximately 22 million single-cell transcriptomes across roughly 600 human cell types from about 3,800 donors representing every decade of life from birth to 90-plus years, balanced by gender (49% male, 51% female), generating approximately 650 billion tokens. 
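The RoPE-based context extension in Stage 2 can be illustrated as position interpolation: dividing positions by the extension factor lowers every rotation frequency so the longer context maps onto the positional range seen during pretraining. The base and head dimension below are conventional defaults, not confirmed MaxToki hyperparameters:

```python
def rope_angles(position, head_dim, base=10000.0, scale=4.0):
    # Position-interpolation sketch: `scale` = 16384 / 4096 = 4, so
    # every rotation frequency is reduced by 4x and token 16383 in the
    # extended context rotates roughly like token 4096 did originally.
    return [(position / scale) / (base ** (2 * i / head_dim))
            for i in range(head_dim // 2)]

print(rope_angles(16383, 64)[0])  # 4095.75, i.e. back inside the trained range
```

The key point is that no positions fall outside the rotation range the model saw at 4,096 tokens; the model only has to adapt to finer-grained spacing, which Stage 2 training handles.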
Combined across both stages, MaxToki trained on nearly 1 trillion gene tokens in total.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1032\" height=\"1194\" data-attachment-id=\"78814\" data-permalink=\"https:\/\/www.marktechpost.com\/2026\/04\/05\/meet-maxtoki-the-ai-that-predicts-how-your-cells-age-and-what-to-do-about-it\/screenshot-2026-04-05-at-1-43-30-pm-2\/\" data-orig-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-05-at-1.43.30-PM-1.png\" data-orig-size=\"1032,1194\" data-comments-opened=\"1\" data-image-meta='{\"aperture\":\"0\",\"credit\":\"\",\"camera\":\"\",\"caption\":\"\",\"created_timestamp\":\"0\",\"copyright\":\"\",\"focal_length\":\"0\",\"iso\":\"0\",\"shutter_speed\":\"0\",\"title\":\"\",\"orientation\":\"0\"}' data-image-title=\"Screenshot 2026-04-05 at 1.43.30\u202fPM\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-05-at-1.43.30-PM-1-259x300.png\" data-large-file=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-05-at-1.43.30-PM-1-885x1024.png\" src=\"https:\/\/www.marktechpost.com\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-05-at-1.43.30-PM-1.png\" alt=\"\" class=\"wp-image-78814\" \/><figcaption class=\"wp-element-caption\">https:\/\/www.biorxiv.org\/content\/10.64898\/2026.03.30.715396v1.full.pdf<\/figcaption><\/figure>\n<\/div>\n<h3 class=\"wp-block-heading\"><strong>The Temporal Prompting Strategy<\/strong><\/h3>\n<p>The most architecturally novel contribution of MaxToki is its prompting strategy. A prompt consists of a context trajectory \u2014 two or three cell states plus the timelapses between them \u2014 followed by a query. 
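Conceptually, such a prompt can be assembled as a single token sequence: rank-value-encoded cells interleaved with the timelapses between them, followed by the query. The special-token names below are hypothetical placeholders, since the paper's exact vocabulary is not reproduced here:

```python
def build_temporal_prompt(context_cells, timelapses_months, query):
    # Sketch: interleave context cell states with the timelapses (months)
    # separating them, then append the query (a cell for timelapse
    # prediction, or a timelapse for cell generation).
    # "<cell>", "<time>", "<query>" are hypothetical marker tokens.
    assert len(timelapses_months) == len(context_cells) - 1
    tokens = []
    for i, cell in enumerate(context_cells):
        tokens += ["<cell>"] + cell
        if i < len(timelapses_months):
            tokens += ["<time>", float(timelapses_months[i])]
    return tokens + ["<query>"] + query

prompt = build_temporal_prompt(
    [["TF1", "GENE2"], ["TF1", "GENE9"]],  # two context cell states
    [120],                                  # 10 years between them
    ["<time>", 60.0],                       # ask for the cell 5 years on
)
```

The timelapse entries are kept as plain numbers here to mirror the continuous numerical tokenization the model uses, rather than one categorical token per duration.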
The model then performs one of two tasks:<\/p>\n<p><strong>Task 1:<\/strong> Given a context trajectory and a query cell, predict the timelapse (in months) needed to reach that query cell from the last context cell.<\/p>\n<p><strong>Task 2:<\/strong> Given a context trajectory and a query timelapse, generate the transcriptome of the cell that would arise after that duration.<\/p>\n<p>For Task 1, a standard cross-entropy loss is insufficient because it treats each timelapse value as a disconnected category. Instead, the research team used continuous numerical tokenization with a mean-squared error (MSE) loss function, teaching the model that timelapses fall along a numerical continuum. This design choice produced dramatically lower prediction errors \u2014 the median prediction error for held-out ages dropped to 87 months with MaxToki, compared to 178 months for a linear SGDRegressor baseline and 180 months for the naive baseline of assuming each query cell was the most common age for that cell type and gender.<\/p>\n<p>Crucially, the model is never explicitly told which cell type or gender it\u2019s dealing with. It infers the trajectory context from the cells themselves \u2014 a form of in-context learning. This is why the model generalizes to held-out cell types it never saw during training: it achieves a Pearson correlation of 0.85 between predicted and ground truth timelapses on completely unseen cell type trajectories, and a Pearson correlation of 0.77 on held-out ages from held-out donors.<\/p>\n<h3 class=\"wp-block-heading\"><strong>GPU Engineering at Scale<\/strong><\/h3>\n<p>Training nearly 1 trillion gene tokens required serious infrastructure work. For the 1 billion parameter variant, the team implemented FlashAttention-2 via the NVIDIA BioNeMo stack built on NeMo, Megatron-LM, and Transformer Engine. 
To enable FlashAttention-2, they modified feed-forward hidden dimensions to be evenly divisible by the number of attention heads \u2014 a hard compatibility requirement. Combined with mixed-precision training using bf16, these changes yielded approximately a 5x improvement in training throughput and a 4x increase in achievable micro-batch size on H100 80GB GPUs. For inference, adopting the Megatron-Core DynamicInferenceContext abstraction with key-value caching resulted in over 400x faster autoregressive generation compared to the naive baseline.<\/p>\n<h3 class=\"wp-block-heading\"><strong>What the Model Learned \u2014 Without Being Told<\/strong><\/h3>\n<p>Interpretability analysis on the 217 million parameter variant revealed something striking: approximately half of the attention heads learned, entirely through self-supervised training with no gene function labels, to pay significantly higher attention to transcription factors compared to other genes. Transcription factors are master regulators of cell state transitions, but the model discovered their importance on its own.<\/p>\n<p>Ablation studies confirmed that both the context cells and the query cell are equally necessary for accurate predictions \u2014 masking either component significantly and equivalently degraded performance. Shuffling genes within the rank value encoding to produce \u201cbag of genes\u201d cells (preserving which genes are present but destroying their relative ordering) also significantly damaged predictions, demonstrating that the model learned to use the relative expression ordering of genes, not merely their presence or absence. 
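The "bag of genes" ablation just described is simple to reproduce in spirit: keep exactly which genes are present but destroy the rank ordering. A schematic, not the paper's code:

```python
import random

def bag_of_genes(ranked_cell, seed=0):
    # Ablation sketch: shuffle the rank value encoding so gene identity
    # is preserved but the relative-expression ordering is destroyed.
    shuffled = list(ranked_cell)
    random.Random(seed).shuffle(shuffled)
    return shuffled

cell = ["TF1", "GENE7", "GENE2", "HK1"]
print(bag_of_genes(cell))  # same four genes, ordering scrambled
```

Feeding such shuffled cells to the model isolates how much of its performance depends on ordering versus mere gene presence, which is what the ablation above measures.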
Further attention analysis showed that individual heads specialized for different components of the prompt \u2014 some attending primarily to context cells, others to timelapse tokens, others to the query \u2014 with many heads exhibiting cell type-specific activation patterns across the roughly 60 cell types tested.<\/p>\n<p>One failure mode of generative models is learning to output averaged representations. The research team trained a doublet detector \u2014 a classifier distinguishing individual cells from simulated doublets formed by merging two cells of the same cell type \u2014 on ground truth cells, then applied it to MaxToki-generated cells. Approximately 95% of generated cells were classified as singlets, confirming that the model produces single-cell resolution transcriptomes rather than blended averages.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Inferring Age Acceleration in Disease \u2014 Including Diseases Never Seen During Training<\/strong><\/h3>\n<p>Given the model was trained only on healthy control donors, the research team tested whether it could infer aging signatures in disease states entirely absent from training. The approach: provide a context trajectory of normal cells, then query with a disease cell and test whether the model infers more or less elapsed time compared to an age-matched control cell.<\/p>\n<p>In lung mucosal epithelial cells from donors exposed to heavy smoking, the model inferred approximately 5 years of age acceleration compared to age-matched non-smoking controls \u2014 consistent with prior reports linking smoking status to telomere shortening and lung aging signatures. In lung fibroblasts from patients with pulmonary fibrosis \u2014 a disease characterized by telomere attrition and cellular senescence \u2014 the model inferred approximately 15 years of age acceleration.<\/p>\n<p>The Alzheimer\u2019s disease analysis produced several clinically important findings. 
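The age-acceleration readout used in these disease analyses can be sketched as a thin wrapper around the Task 1 timelapse prediction; `predict_timelapse_months` below is a hypothetical stand-in for the model, not an actual MaxToki API:

```python
def age_acceleration_years(predict_timelapse_months, context,
                           disease_cell, age_matched_control_cell):
    # Sketch of the readout described above: query the model with a
    # disease cell and an age-matched control cell against the same
    # normal context trajectory; the gap in inferred elapsed time
    # (months) is the age acceleration.
    d = predict_timelapse_months(context, disease_cell)
    c = predict_timelapse_months(context, age_matched_control_cell)
    return (d - c) / 12.0

# Toy stand-in: pretend the model reads 780 vs 600 months, i.e. +15
# years, on the order of the pulmonary fibrosis result.
stub = lambda ctx, cell: {"fibrosis": 780, "control": 600}[cell]
print(age_acceleration_years(stub, None, "fibrosis", "control"))  # 15.0
```
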
In microglia from Alzheimer\u2019s patients drawn from the Mount Sinai NIH Neurobiobank, the model inferred approximately 3 years of age acceleration compared to age-matched controls. This result was replicated in an independent cohort from Duke and Johns Hopkins Alzheimer Disease Research Centers using homeostatic microglia specifically. Critically, this second cohort also included patients with mild cognitive impairment and Alzheimer-resilient patients \u2014 individuals who share the same neuropathological changes as Alzheimer\u2019s patients but exhibit no cognitive impairment. The model did not infer age acceleration in homeostatic microglia from either the mild cognitive impairment or resilient groups compared to controls, suggesting these patients may be protected from the disease-related age acceleration in this microglial subtype. This distinction between full Alzheimer\u2019s disease and Alzheimer resilience \u2014 captured without any disease-specific training \u2014 is one of the most clinically significant findings in the paper.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h3>\n<p>MaxToki represents a meaningful step forward in how AI models can reason about biological time. By moving beyond single-cell snapshots to model entire trajectories of gene network change across the human lifespan, it addresses a limitation that has constrained computational biology for years. The combination of rank value encoding, continuous numerical tokenization, RoPE-based context extension, and in-context learning allowed the model to generalize to unseen cell types, unseen ages, and even disease states it was never trained on \u2014 all while learning, without any supervision, to pay higher attention to the transcription factors that actually drive cell state transitions.<\/p>\n<p>What makes MaxToki particularly compelling for both researchers and engineers is that its predictions did not stop at the computational level. 
The model nominated novel pro-aging drivers in cardiac cell types that were subsequently validated to cause age-related gene network dysregulation in iPSC-derived cardiomyocytes and measurable cardiac dysfunction in living mice within six weeks \u2014 a direct line from in silico screening to in vivo consequence. With pretrained models and training code publicly available, MaxToki offers a reusable framework that the broader community can build on, fine-tune for specific disease contexts, and extend to new tissue types. As longitudinal single-cell datasets continue to grow, temporal foundation models like MaxToki may become a standard tool for identifying intervention points before age-related diseases take hold.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<p>Check out the <strong><a href=\"https:\/\/www.biorxiv.org\/content\/10.64898\/2026.03.30.715396v1.full.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Paper<\/a><\/strong>, <strong><a href=\"https:\/\/huggingface.co\/theodoris-lab\/MaxToki\" target=\"_blank\" rel=\"noreferrer noopener\">Model<\/a><\/strong> and <strong><a href=\"https:\/\/github.com\/NVIDIA-Digital-Bio\/MaxToki\" target=\"_blank\" rel=\"noreferrer noopener\">Repo<\/a><\/strong>.<\/p>\n<p>The post <a href=\"https:\/\/www.marktechpost.com\/2026\/04\/05\/meet-maxtoki-the-ai-that-predicts-how-your-cells-age-and-what-to-do-about-it\/\">Meet MaxToki: The AI That Predicts How Your Cells Age \u2014 and What to Do About It<\/a> appeared first on <a href=\"https:\/\/www.marktechpost.com\/\">MarkTechPost<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Most foundation models in biol&hellip;<\/p>\n","protected":false},"author":1,"featured_media":675,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-674","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/674","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=674"}],"version-history":[{"count":0,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=\/wp\/v2\/posts\/674\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest
_route=\/wp\/v2\/media\/675"}],"wp:attachment":[{"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=674"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=674"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/connectword.dpdns.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=674"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}