Team teardown · Elorian AI

“BAIR, assemble.”

The senior researchers behind Gemini's data and Apple's foundation models, reunited for a final battle — against the reasoning gap.

Who answered

20 frontier authors

The mission

Visual reasoning

The funding

$55M seed

Where it points

Physical-world AI

0.1 — The read

Twenty people. Six months. Not Gemini-adjacent — senior contributors to Gemini, PaLM 2 and Apple's foundation models.

This brief reads Elorian as a talent-flow story, in five questions — who they assembled, what they're after, who's writing the checks, where it points, and what the rest of us can steal.

Q	The question	The short answer
Q1	Who they assembled	A frontier-model program, rebuilt in miniature.
Q2	What they're after	The capability the big labs treat as a side quest.
Q3	Who's writing the checks	$55M, from the same network that built the team.
Q4	Where this points	No product moat yet. Only a talent moat.
Q5	What others can learn	The network doesn't transfer. The method does.

§ 01 — Question one of five

Who did they assemble?

Twenty people in six months — most with their names on the papers behind systems the industry now runs on.

1.1 — The standouts

Ten hires that set the bar.

Andrew Dai

CEO & Co-Founder

Co-wrote the 2015 pre-training paper the GPT line cites; co-led GLaM, PaLM 2 and Gemini's data org.

Dustin Tran

Chief Reasoning Architect

Joined from xAI — led post-training for Grok 4. Before that Gemini's eval lead; his Gemini-Exp-0801 hit #1 on LMSYS.

Yinfei Yang

Co-Founder · Chief Multimodal Architect

Core author across Apple's multimodal line — ALIGN, MM1, MM1.5, Ferret-UI.

Seth Neel

Co-Founder

Left a Harvard professorship to build this. Founded machine unlearning; earlier co-founded a company that raised $41M.

Le (Tycho) Xue

Founding Researcher

Technical lead of Salesforce's BLIP-3; first author of ULIP, which aligns language with 3D point clouds. Autonomous-driving background at NVIDIA.

Richard Zhang

Founding Researcher

Gemini post-training and reward modeling at DeepMind; co-creator of Google Vizier.

Seojin Bang

Founding Researcher

A core architect on Gemini at DeepMind from 2022 to 2025. CMU PhD; the deep-inside-Gemini hire.

Forrest Huang

Founding Researcher

Berkeley BAIR PhD; core contributor to MM1.5, Ferret-UI and AFM at Apple, then Gemini-for-UI. 2,000+ citations.

Zhen Xu

Founding Member

Apple Intelligence search RLHF and agentic search; co-built MUFASA with Dai at Google Health.

Jihua Huang

Researcher

A decade at SRI on DARPA and Toyota vision programs — the one hire from outside the network.

1.2 — Authorship, not proximity

They didn't just use the last frontier. They helped build it.

Andrew Dai

LM pretraining '15 · GLaM · PaLM 2 · Gemini data

The pre-training + fine-tuning recipe the GPT papers cite, then Google's flagship data pipelines.

Yinfei Yang

ALIGN · MM1 · MM1.5 · Ferret-UI

Apple's multimodal foundation stack, paper by paper.

Dustin Tran

Edward · Gemini evals · Grok 4

The evaluation and reasoning layer of two different frontier labs.

Le (Tycho) Xue

BLIP-3 / xGen-MM · ULIP

Open multimodal models, and language aligned to 3D geometry.

Richard Zhang

Google Vizier · OptFormer

The optimization tooling a large slice of the field runs on.

Seth Neel

Descent-to-Delete

The paper that started machine unlearning as a discipline.

Roughly two dozen named systems, twenty people. Primary authorship at this density is the real moat.

1.3 — A model program, in miniature

Every layer already has a named owner.

Data & pretraining

Dai · Das

The base models and the data that feeds them.

Multimodal architecture

Yang · F. Huang · Xue

Vision-language and 3D understanding — the core of the thesis.

Post-training & reasoning

Tran · R. Zhang · Xu

RL, reward modeling and the reasoning layer.

Evaluation

Tran · Das

How you actually know the model got better.

Serving & infra

Kumar · Hu

Turning research models into something that runs — ex-AWS Bedrock, ex-Meta.

Company around the lab

Valentine · Jiang · Ling

First PM (ex-DeepMind), first GTM, first finance lead.

Most seed teams are a research core and a wish list. Elorian has a named owner for every layer — that's the anomaly.

1.4 — Hiring topology

A reunion, not a recruiting search.

The team didn't come off the open market. It grew outward from three closed pools: Gemini alumni, Apple MM1 alumni, and Berkeley BAIR. There was exactly one open-market hire, and it was deliberate.

Google · 9
Apple · 3
Amazon · 3
Meta · 2
ex-OpenAI / Anthropic · 0

i. Berkeley BAIR: the research core, co-authors and labmates

ii. Gemini & Apple orbits: frontier-lab colleagues, one hop out

iii. Open market: the one deliberate outside hire

Andrew Dai · Founder · CEO

Forrest Huang · Dustin Tran · Richard Zhang · Yinfei Yang · Seth Neel · Seojin Bang · Zhen Xu · Marcella Valentine · Jihua Huang

1.5 — The connective tissue

Look past the employers, and it's one lab.

The résumés say Gemini and Apple. The thread underneath is a school: six of the team trace to Berkeley, and the research core to its AI lab, BAIR. The trust predates the paychecks.

Forrest Huang

Berkeley BAIR · PhD

Then Apple AFM and MM1.5, then Google. The vision-language line, carried out of the lab.

Richard Zhang

Berkeley · Applied-Math PhD

Then Gemini post-training and Vizier at DeepMind. Optimization, lab to frontier.

Dustin Tran

Berkeley · BS Math + Stats

Then Gemini evals, then Grok 4 post-training at xAI. Berkeley was the starting line.

Michelle Ling

Berkeley · CS + Haas

First finance and ops lead — the network reaches past the research bench, too.

Same pattern as our Eigen AI read: the real map is the lab and the co-authors, not the last logo.

1.6 — The differentiator

Where the thesis stops being a slogan.

A visual-reasoning pitch is cheap. Grounding it — geometry, sensors, deadlines — is not. Two hires make the difference.

Images → space

Le (Tycho) Xue

First author of ULIP / ULIP-2, aligning language with 3D point clouds, with an autonomous-driving background at NVIDIA and Cadence. The bridge from flat images to geometry and physical constraint.

Research → the real world

Jihua Huang

Ten years at SRI on applied vision for DARPA SemaFor, ARPA-H, and Toyota driver-monitoring. The one senior researcher hired off the open market, bringing a lineage of AI that ships on deadlines.

They imported applied, physical-world computer vision on purpose — a team that hires against its own blind spot.

§ 02 — Question two of five

What are they after?

One capability the big labs treat as a side quest: native visual, spatial and physical reasoning — and the industries waiting on it.

2.1 — The thesis, in one line

A concentrated wager on the one capability the big labs treat as a side quest.

The big labs bolt vision onto language as a feature. Elorian's bet is the inverse: visual, spatial and physical reasoning as the foundation — built for robotics, aerospace, medicine and manufacturing.

“An elementary school kid can beat all the frontier models.”

Andrew Dai, CEO — on visual reasoning · The Neuron, June 2026

On the record, three times over: no model clears even BabyVision's six-year-old tier (it tests ages 3–12). His analogy: text AI is in the iPhone era; visual AI is a Nokia — “64×64 pixels,” ARC-AGI's actual resolution. Elorian pledges to publish its own evals.

2.2 — The method, now on record

Reasoning in the image — not about it.

Two July 2026 founder podcasts lay the technical bet out in public for the first time. Four commitments:

01 · Visual traces

Point, trace, manipulate

Today's models reason about images in text. Elorian's models act on them: pointing to count, tracing a path through a maze or a floor plan, the way a child uses a finger.

02 · One model, one space

Generate and edit, natively

One model producing text, images and video in a shared embedding space, his stated stepping stone to visual reasoning. The pre-/post-training split? “Arbitrary: the only distinction is scale.”

03 · Data first

Distributions, not examples

“Garbage in, garbage out” holds at frontier scale. His Bayesian framing: data is a statistics problem, architecture an optimization problem. These are the muscles Dai's Gemini data org exercised.

04 · Sequencing

Post-train first, pre-train later

It's “enough to post-train first,” with early signs that it lifts visual reasoning, and pre-training is deliberately held back. “No need to rush”: the full-stack bench was built for that moment.

This validates the hiring read: the data and post-training owners weren't incidental hires — they are the method.

2.3 — The lane

Everyone's chasing the physical world. Elorian wants the layer beneath it.

3D / world models

World Labs · Fei-Fei Li

Dai's on-record contrast: world models tackle the visual modality alone — robotics, entertainment — where Elorian combines modalities seamlessly in one foundation model.

Robot foundation models

Physical Intelligence · Pi

General-purpose robot policies — embodiment-first, where Elorian stays a layer up, model-side.

Humanoid robotics

Figure AI · Hardware + AI

Owns the robot itself — the most applied, most capital-heavy end of the same thesis.

The incumbents

DeepMind · OpenAI · Meta

Treat vision as a feature bolted onto language. Dai, on record: Gemini leads, Anthropic trails on MMMU / OCR-class benchmarks — and all of them fail his board-game test.

NEA already brackets Elorian with World Labs, Physical Intelligence, Sakana and CuspAI as the “neolab” cohort. His edge claim within it: a specialist can drop the trivia for smaller, cheaper models that win one capability.

§ 03 — Question three of five

Who's writing the checks?

$55M at seed, and a cap table drawn from the same closed research network that built the team.

3.1 — The cap table

The money knows the thesis.

$55M at a roughly $300M valuation, April 2026. The signal isn't the number. It's who.

Board · lead

Brian Zhan · Striker Ventures

A physical-AI investor whose portfolio runs through Skild AI, Periodic Labs, Voyage AI. His robotics thesis maps directly onto Elorian's.

Co-leads

Menlo · Altimeter

Altimeter brings a public-markets, growth lens to the cap table — unusually early for a seed.

Strategic

NVIDIA · Jeff Dean

Silicon on the cap table, and a personal angel check from the person who built Google Brain.

Angels

Sharon Zhou + others

An AI-native angel bench layered over the institutional leads — relationships more than capital.

Research relationships over growth capital — the same network that built the team.

§ 04 — Question four of five

What future does this point to?

No product yet, a crowded lane — and a first contract that will define the company.

4.1 — The scoreboard

What they've launched: a thesis, and a bench.

No public model, API, paper or demo. There's no product moat yet — only a talent moat. Which is exactly why this is a talent brief: for now, the team is the company.

When	Milestone
Feb 2026	Founded — weeks after Gemini 3 shipped, the classic post-release exit window. Dai “and some friends” from Apple and DeepMind; in-person by design.
Apr 2026	Out of stealth. $55M seed at ~$300M; framed as the first lab built around native visual reasoning.
Jun 2026	Dustin Tran joins as Chief Reasoning Architect — ex-Grok 4 post-training lead.
CVPR '26	Co-hosted a “reasoning gap in visual AI” dinner with NVIDIA, NEA, Twelve Labs, GMI Cloud.
Jul 2026	On record: post-training-first strategy; internal wins over Gemini Flash on visual benchmarks; public model late 2026 — plus a pledge to release their own evals.
Ongoing	Aggressive hiring; capital earmarked for compute, team, and early customer pilots.

4.2 — The first contract

Our guess: the first big check reads technical drawings.

Dai has named the early lanes himself — and the first model is due late 2026. Four candidates, ranked:

Engineering & technical drawings

Named first on the podcast

Diagrams and technical drawings defeat object detection. His sharpest tell: models “can't even tell what two things a wire is connected to” — said pointedly about data-center buildout.

Architecture & design

Floor plans, twice over

Count the meeting rooms, doors, windows; trace a wheelchair path for code compliance. Design iteration over spatial constraints no text-first model can hold.

Video understanding

Industries on legacy CV

Stock tracking, monitoring — buyers running traditional computer vision today, not building their own models. He explicitly won't sell to labs that train models.

Robotics OEM

Via the cap table (NVIDIA)

His example: a robot at a factory control panel that reasons to pull the safety lever first. NVIDIA is an investor — still the biggest prize, but downstream of the first model.

Satellite imagery appears as wildfire detection, not defense. Factory automation folds into engineering. In the long run he still wants a broad model: “the only way to recoup the compute.”

4.3 — Staying power

How they plan to survive being easy to leave.

APIs cut both ways — easy to adopt, easy to churn off. Asked directly, Dai gave a three-step answer.

The wedge

Marquee names, few verticals

“A few key verticals and a few marquee enterprises” — relationship-led proof the model wins on real visual-reasoning work, not benchmarks.

The opening

Incumbents weakest here

Where current models fail outright, there's no loyalty to churn against — “less incentive to stick with an API” if a rival is priced right and accurate.

The moat

API → tools → ecosystem

The retention plan is explicitly the Anthropic route: platforms and tooling around the API once the first model ships.

Recruiting signal: expect platform, DX and forward-deployed roles to open within two quarters of the model launch — a different hiring market than the research bench.

§ 05 — Question five of five

What can the rest of us learn?

You can't copy the network — it took a decade to earn. But the method underneath it transfers.

5.1 — The lessons

Five moves worth stealing.

01 · Screening

Hire authors, not alumni

“Worked on Gemini” describes thousands of people. Named authorship describes a handful. Screen for the citation, not the logo on the last job.

02 · Sourcing

Recruit the trust graph

The fastest search isn't a search. Map co-authors, labs and shared codebases. People who've already built together align in weeks, not quarters.

03 · Sequencing

Name an owner per layer

By hire twenty, every layer of a model program had a named owner. Hire against the org you'll need, not the résumés you happen to like.

04 · The exception

Buy your blind spot

The one open-market hire was the skill the network lacked. Know precisely which hire your friends can't supply, and go outside for exactly that one.

05 · The pitch

Sell the problem, not the package

A Harvard chair and a Grok 4 lead didn't move for comp. They moved for one unsolved problem and for full-stack scope: at big labs “if you work in post-training, you don't look at pre-training.”

The catch

Reunions must be earned.

This playbook assumes a decade of relationships to mine. Without one, the map has to be built deliberately, which is a craft of its own.

5.2 — The culture transplant

He isn't just hiring the people. He's rebuilding the room.

“Google Brain was the Bell Labs of this era.”

Andrew Dai — on the diaspora behind OpenAI, Anthropic, SSI… and Elorian

01 · Osmosis

In-person, deliberately

Research taste transfers by proximity: corridor talk, micro-kitchen debates, early wrong results. Elorian is in-office by design to recreate it.

02 · Screening

The residency filter

Brain's residency screened for unusual backgrounds and intense curiosity, not GPA, and produced a generation of founders. Expect the same filter here.

03 · Safety

Comfortable being wrong

Show results early, wrong ones included; tell anyone “this isn't the right direction.” The conviction on top is Sutskever's old line: “success is guaranteed.”

Recruiting signal: his own Brain interns — Fedus, Ha, Gu — now run their own labs. The Brain diaspora is both Elorian's talent pool and its competition.

Base to Base · Recruiting

The strongest early AI teams aren't hired off the open market. They're reassembled from a trust graph — co-authorship, shared labs, and a decade of working together.

Read that graph and you can see where a company's value is forming before the market narrative catches up.

— The takeaway

The team is the moat —
and it was assembled
one relationship at a time.

Brief

Elorian AI · Talent Brief

Prepared by

Base to Base · Recruiting

Methodology & limitations

How this read was assembled.

— Sources

Public LinkedIn profiles and work history.
Google Scholar, citations, and published research.
Funding disclosures and cap-table reporting.
Founder interviews and July 2026 podcasts.

— Limitations

Team of ~20; a few roles remain approximate.
This is not a complete org chart.
The goal is a pattern-level read on how the team was formed — not a roster. Teardowns like this are how our searches begin; this one's on us.
