Latent Alpha — Geometric deep learning for industry

— 02 / GDL

A geometric view of the world.

Biology is geometric. States, trajectories, and structure all live on a manifold. LLMs flatten that into tokens; geometric deep learning preserves the structure the data actually occupies. Both are powerful, but only one respects the geometry the data is written in.

| | Large language models | Geometric deep learning |
|---|---|---|
| Data view | Sequences of tokens | Manifolds, graphs, point clouds |
| What's preserved | Word order, co-occurrence statistics | Geometry, symmetries, trajectories |
| Inductive bias | Sequence order: permutation breaks it | Equivariance: rotations and reflections built in |
| Training signal | Next-token prediction on massive corpora | Structure-aware objectives on scientific data |
| Failure mode | Hallucinates plausible tokens | Interpolates on the manifold; reviewable |
| Good for | Language, code, retrieval, agents | Single-cell, structure, trajectory, design |

Most big problems, and most production ML, aren't really language problems, although some try to make them so. They are structure problems. When the data lives on a manifold (cells, markets, molecules, sensor fields), geometric methods tend to be smaller, faster, more accurate, and easier to defend and interrogate.

— 03 / Capabilities

A small set of powerful primitives.

Every engagement assembles from the same core toolkit. We do a few things deeply rather than many things shallowly — and we publish what we learn.

— 001

Manifold learning

Diffusion-geometric methods (PHATE, MAGIC, MIOFlow) for embedding, denoising, and trajectory inference on high-dimensional data, preserving local and global structure where t-SNE and UMAP often distort it.
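
As a rough illustration of the diffusion geometry these methods build on, the sketch below turns pairwise distances into a random-walk transition matrix and runs a few diffusion steps. It is a simplified toy, not the PHATE, MAGIC, or MIOFlow algorithm. The function names, the fixed Gaussian bandwidth, and the two-cluster data are assumptions made for illustration.

```python
import math

def gaussian_affinity(points, bandwidth):
    """Pairwise Gaussian kernel: nearby points get affinity close to 1."""
    return [[math.exp(-math.dist(p, q) ** 2 / bandwidth ** 2) for q in points]
            for p in points]

def markov_normalize(affinity):
    """Row-normalise affinities into random-walk transition probabilities."""
    out = []
    for row in affinity:
        total = sum(row)
        out.append([a / total for a in row])
    return out

def matmul(a, b):
    """Plain matrix product for small list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diffuse(transition, steps):
    """Raise the transition matrix to the given power (a t-step random walk)."""
    result = transition
    for _ in range(steps - 1):
        result = matmul(result, transition)
    return result

# Hypothetical toy data: two tight clusters that sit far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
P = markov_normalize(gaussian_affinity(points, bandwidth=1.0))
P3 = diffuse(P, steps=3)

# Probability that a 3-step walk starting at point 0 is still in its own cluster.
stay_probability = sum(P3[0][:3])
print(f"stay-in-cluster probability: {stay_probability:.6f}")
```

Walks rarely cross the gap between clusters, so the rows of the diffused matrix describe each point by the neighbourhood it can reach. Distances between those rows are one way to compare points along the data's shape rather than in raw coordinates.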

— 002

Generative flows

Score-based and flow-matching models conditioned on geometry. Sample from data distributions with explicit, controllable trajectories.
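
To make "explicit, controllable trajectories" concrete, here is a minimal sketch of the regression target used in flow matching with straight-line interpolation paths, which is one common choice among several. No network is trained and no geometric conditioning is shown: the exact target velocity stands in for a learned model, and every name below is hypothetical.

```python
def interpolate(x0, x1, t):
    """Point at time t on the straight path from source sample x0 to data sample x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; a model v(x_t, t) would be regressed onto this."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predicted, x0, x1):
    """Squared error between a predicted velocity and the target velocity."""
    target = target_velocity(x0, x1)
    return sum((p - u) ** 2 for p, u in zip(predicted, target))

def euler_sample(x0, velocity_fn, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x, dt = list(x0), 1.0 / steps
    trajectory = [list(x)]
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        trajectory.append(list(x))
    return trajectory

# Hypothetical source and data samples.
x0, x1 = [0.0, 0.0], [2.0, -1.0]

def oracle(x, t):
    """Stand-in for a trained model: returns the exact target velocity."""
    return target_velocity(x0, x1)

path = euler_sample(x0, oracle, steps=10)
midpoint = interpolate(x0, x1, 0.5)
```

Because sampling is an ODE integration, every intermediate state in `path` is available for inspection, which is what makes the trajectory explicit rather than implicit.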

— 003

Equivariant networks

Architectures that bake symmetry into the model — translation, rotation, permutation — so they generalise from a fraction of the data.
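
A small numerical check can make the symmetry idea concrete. The sketch below is not a network: it uses a hand-built feature (sorted pairwise distances) that is invariant to rotation and to point order, and a centroid that is rotation-equivariant. The function names and the toy point cloud are assumptions for illustration.

```python
import math

def rotate(points, angle):
    """Rotate 2D points about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def distance_signature(points):
    """Sorted pairwise distances: unchanged by rotation or by point order."""
    return sorted(math.dist(p, q)
                  for i, p in enumerate(points) for q in points[i + 1:])

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(a, b))

cloud = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.5), (0.3, -0.7)]
angle = 0.8
rotated = rotate(cloud, angle)
shuffled = [cloud[2], cloud[0], cloud[3], cloud[1]]

# Invariance: f(g(x)) == f(x)
invariant_to_rotation = close(distance_signature(rotated), distance_signature(cloud))
invariant_to_order = close(distance_signature(shuffled), distance_signature(cloud))

# Equivariance: f(g(x)) == g(f(x))
equivariant_centroid = close(centroid(rotated), rotate([centroid(cloud)], angle)[0])

# Raw coordinates have neither property: they change under rotation.
raw_coordinates_unchanged = close(rotated[0], cloud[0])
```

A model built only from such features never has to see a rotated or reordered copy of an example to handle it correctly, which is where the data savings come from.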

— 004

Foundation models

Domain-specific pretraining at scale, with the geometric inductive biases that let a single model serve many downstream tasks reliably.

— 005

Causal inference

Counterfactual modelling on observational data — interventions, treatment effects, and confounder discovery, grounded in structure.

— 006

Interpretation & trust

Tools that let domain experts read what a model has actually learned — visualisations, attributions, and certified bounds. Models you can defend.

— 04 / Industries

Where the geometry matters.

We partner with teams whose data is high-dimensional, noisy, or relational — and where being right matters more than being fast.

— How GDL helps

Markets live on manifolds, not in flat factor spaces.

Returns, volatility, and cross-asset relationships sit on low-dimensional, curved structures that drift with regime. Geometric deep learning models that geometry directly — preserving local neighbourhoods, respecting symmetries, and exposing the topology of the risk surface rather than collapsing it to a few principal components.

AML & fraud detection with GNNs over transaction networks — surfacing rings, mule chains, and synthetic identities.
Regime detection from the geometry of return distributions, not just thresholds.
Non-linear factor models that recover structure missed by PCA and linear betas.
Portfolio embeddings for stress-testing exposure to correlated tail events.
Counterparty & flow graphs using GNNs over trade and settlement networks.

— How GDL helps

Biology is geometric by nature — molecules, manifolds, and trajectories.

The methods we publish were built for this domain: PHATE, MELD, MIOFlow, ImmunoStruct. Cells move along developmental manifolds; proteins fold in 3D; perturbations propagate through molecular graphs. Equivariant and manifold-aware models capture this without throwing away the structure that makes biology biology.

Single-cell trajectory inference on noisy, high-dimensional measurements.
Drug & small-molecule design with SE(3)-equivariant networks over molecular graphs.
Perturbation modelling — predicting cell-state response to interventions.
Multi-modal integration of genomics, imaging, and structural data.

— How GDL helps

Audiences are continuous, not buckets — and creative effects compound.

Identifier deprecation and platform fragmentation broke the lookalike era. Embedding audiences on a manifold — where similarity is geodesic, not categorical — produces targeting and lift estimates that survive cold start, sparse signal, and creative drift. Graph models handle attribution as a propagation problem, not a last-touch one.

Audience embeddings robust to deprecation and partial signal.
Creative-effect attribution via causal graphs over exposure and outcome.
Lift modelling with manifold-aware uplift estimators.
Cold-start & cross-platform generalisation through transferable geometry.

— How GDL helps

Data has shape — and most of it lives on graphs.

Records, lineages, and schemas form graphs; entities live in embedding spaces; quality issues cluster by topology. We bring graph intelligence to the modern data stack — ingesting broadly from Postgres, Snowflake, BigQuery, Kafka, S3, and native graph stores (Neo4j, TigerGraph, Neptune) — and treat the platform itself as a geometric object instead of a pile of tables.

Graph intelligence over enterprise knowledge graphs — entity, relationship, and event reasoning at scale.
Broad ingestion from Postgres, Snowflake, BigQuery, Kafka, S3, and native graph DBs (Neo4j, TigerGraph, Neptune).
Entity resolution & deduplication on learned embedding manifolds.
Lineage graphs with GNN-based impact and root-cause analysis.
Drift & quality monitoring via manifold distance, not just summary statistics.
Schema matching across heterogeneous sources at scale.
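
The drift-monitoring item above can be illustrated with a toy case in which per-feature summary statistics miss a change in shape. The sketch uses mean nearest-neighbour distance to a reference sample as a crude stand-in for a manifold distance. That proxy, the function names, and the synthetic data are all assumptions for illustration.

```python
import math

def circle(n, radius, phase=0.0):
    """n evenly spaced points on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n + phase),
             radius * math.sin(2 * math.pi * k / n + phase)) for k in range(n)]

def column_stats(points):
    """Per-feature mean and variance, the usual summary statistics."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    variances = [sum((p[d] - means[d]) ** 2 for p in points) / n
                 for d in range(dims)]
    return means, variances

def off_manifold_score(batch, reference):
    """Mean distance from each batch point to its nearest reference point."""
    return sum(min(math.dist(p, r) for r in reference) for p in batch) / len(batch)

reference = circle(200, radius=1.0)            # reference data on a unit circle
healthy = circle(40, radius=1.0, phase=0.01)   # new points on the same circle
# Drifted batch: half collapsed to the centre, half pushed out to radius sqrt(2).
drifted = [(0.0, 0.0)] * 20 + circle(20, radius=math.sqrt(2.0))

healthy_stats = column_stats(healthy)
drifted_stats = column_stats(drifted)
healthy_score = off_manifold_score(healthy, reference)
drifted_score = off_manifold_score(drifted, reference)
```

Both batches have zero mean and variance 0.5 in each feature, so a summary-statistics monitor sees no difference, while the nearest-neighbour score separates them clearly.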

— How GDL helps

Grids, weather, and climate live on graphs and manifolds with hard physics.

Generic deep nets ignore conservation laws and symmetries that energy and climate systems obey by definition. Graph and equivariant architectures — combined with physics-informed losses — produce forecasts and control policies that respect the dynamics, generalise across topologies, and stay stable under distribution shift.

Grid-edge optimisation with GNNs over network topology.
Geospatial & weather inference on the sphere using equivariant models.
Physics-informed forecasting for load, generation, and storage.
Climate-risk modelling across coupled physical and economic systems.

— How GDL helps

State and action live on real, continuous, symmetric spaces.

Robots rotate, translate, and manipulate objects in 3D. Industrial processes have invariances and conservation laws that any sane model should bake in, not learn from scratch. Equivariant networks and Lie-group-aware policies need orders of magnitude less data and generalise across hardware, factories, and seasons.

Sensor fusion across LiDAR, vision, and IMU on common geometric structure.
Anomaly detection using manifold distance over multivariate time series.
Equivariant policies for manipulation that transfer across configurations.
Predictive maintenance via graphs of components and failure propagation.

— 05 / Demo

The S&P 500, on a manifold.

Six months of returns for the S&P 500, embedded into three dimensions and coloured by sector. Hover any ticker; rotate, pan, and zoom.

The geometry shows what flat factor models can't — sectors cluster, regimes separate, and outliers sit where they should. Same idea, applied to your data: portfolios, audiences, cells, sensors, customers.

[Interactive 3D chart: sp500_6mo]

— 06 / Publications

The methods, in the literature.

Selected work from our team and collaborators that defines the toolkit we apply across industries. The full list lives on arXiv and Scholar.

— 2019

Visualizing structure and transitions in high-dimensional biological data (PHATE)

Nature Biotechnology

— 2022

Multiscale PHATE identifies multimodal signatures of COVID-19

Nature Biotechnology

— 2021

Quantifying the effect of experimental perturbations at single-cell resolution (MELD)

Nature Biotechnology

— 2025

ImmunoStruct enables multimodal deep learning for immunogenicity prediction

Nature Machine Intelligence

— 2023

Multi-view manifold learning of human brain-state trajectories

Nature Computational Science

— 2024

Mapping the gene space at single-cell resolution with gene signal pattern analysis

Nature Computational Science

— 2025

AAnet resolves a continuum of spatially localized cell states to unveil intratumoral heterogeneity

Cancer Discovery

— 2019

Exploring single-cell data with deep multitasking neural networks (SAUCIE)

Nature Methods

— 2023

Single-cell analysis reveals inflammatory interactions driving macular degeneration

Nature Communications

— 2023

MIOFlow: manifold interpolating optimal-transport flows for trajectory inference

NeurIPS

— 2025

Learnable filters for geometric scattering modules

Proceedings

— 2023

Multiscale geometric and topological analyses for characterizing immune responses from single-cell data

Trends in Immunology


— 07 / Team

Built by the people who wrote the methods.

Latent Alpha spun out of the Krishnaswamy Lab at Yale. Our team authored the diffusion-geometry methods we now apply across industries.

Founder & CSO

Smita Krishnaswamy, PhD

Associate Professor of Computer Science and Genetics at Yale. PhD from the University of Michigan (EECS); postdoctoral training in systems biology at Columbia. One of the leaders in geometric deep learning and graph neural network research.

Founder & CEO

David Kolb, MBA

Over 30 years of life-science experience as a banker, founder, inventor, senior manager, investor, and EIR at Yale. Expertise across therapeutic areas, machine learning, geometric deep learning, and building great teams.

President

David Lewin, PhD

Twenty years managing technology, IP, and science-based business alliances with biotech and pharma worldwide. Director of Business Development at Yale Ventures and EIR. Group Leader, Alliance Management & Biomarkers Discovery at CuraGen.

Advisor

Brian Gallagher, Jr., PhD

15+ years of biotech VC expertise (SR One, Abingworth, Trekk Venture Partners). Public and private board experience with 18+ biotechnology companies, including Nimbus Therapeutics. Deep scientific and operational background.

The shape of intelligence, in your data.

We make the hidden geometry of data useful.

Let's build a program together.
