
Meta's Superintelligence Research Lab Unveils Its First AI Model, Muse Spark

Meta's Superintelligence Research Lab (MSRL) has released its first model, Muse Spark. It features two reasoning modes — Thinking and Contemplating — along with multimodal capabilities, and comes with benchmark results targeting GPT-4o and Claude 3.7.

April 2026 · AI News

Meta officially launched the Superintelligence Research Lab. It's a separate organization from the existing Meta AI team. The first model is called Muse Spark. It's an entirely different line from the Llama series.

Llama is an open-source public model: anyone can download it and run it locally. Muse Spark is different. It's closed and accessible only through an API. This is the first time Meta has chosen to run open-source and closed models simultaneously.

What stands out in this release is the reasoning mode design. Two modes were built into a single model. Thinking is fast reasoning, Contemplating is deep reasoning. A single API parameter switches between them. No need to run two separate models.

Quick Summary
— Meta Superintelligence Research Lab's first public model: Muse Spark
— Two reasoning modes: Thinking (fast) / Contemplating (deep)
— Text + image multimodal (no image generation)
— Separate closed model from Llama
— Accessible via Meta AI app, API, and Meta.ai web
Table of Contents
  1. Meta Superintelligence Research Lab — Why a Separate Organization
  2. Thinking vs Contemplating — Two Reasoning Modes
  3. API Call Structure — reasoning_mode Parameter
  4. Multimodal — Image Understanding Only, No Generation
  5. Key Model Comparison
  6. Use Case Recommendations — When to Use Which Mode
  7. Relationship with Llama — Dual Strategy
  8. Access Methods — Meta AI App, Web, and API
  9. Frequently Asked Questions

Meta Superintelligence Research Lab — Why a Separate Organization

Meta already had an AI research team. FAIR (Facebook AI Research) was the main body. The Llama series came out of that team. The newly launched Superintelligence Research Lab has a different goal. AGI is explicitly on the table.

The competitive landscape makes the context clear. OpenAI has o3, Anthropic has Claude 3.7, Google has Gemini 2.5 Pro. All are closed, high-performance reasoning models. Meta was the only one missing from this market. Muse Spark is the first attempt to fill that gap.

FAIR continues to handle foundational research and open source. The Superintelligence Research Lab takes on closed, high-performance model development. Both teams run in parallel. Meta hasn't abandoned its open-source strategy — it's added another layer on top of it.

Thinking vs Contemplating — Two Reasoning Modes

Thinking is the fast reasoning mode. It's used for tasks where response speed matters. Coding assistance, document summarization, and general Q&A fall here. Most everyday tasks are handled well enough by this mode.

Contemplating is different. It's like a teacher working through a math problem step by step in a notebook. The entire intermediate reasoning process is included in the response. It's suited for mathematical proofs, scientific calculations, and complex logical reasoning. Responses are slower, but you can check what judgment was made at each step.

The key is that both modes live inside a single model. A single API parameter switches between them. This is different from how OpenAI runs GPT-4o and o3 as separate models. No need to change model names in the codebase — just modify the parameter.

API Call Structure — reasoning_mode Parameter

The API call format is similar to the OpenAI-compatible structure. There's not much to change in existing codebases. The only difference is the reasoning_mode parameter. Just set it to either "thinking" or "contemplating".

# Thinking mode — fast reasoning
{
  "model": "muse-spark",
  "reasoning_mode": "thinking",
  "messages": [{"role": "user", "content": "..."}]
}

# Contemplating mode — deep reasoning, includes think block
{
  "model": "muse-spark",
  "reasoning_mode": "contemplating",
  "messages": [{"role": "user", "content": "..."}]
}

Contemplating mode responses have a different structure. The intermediate reasoning process goes inside a <think> block, and the final answer appears separately below it. The two sections need to be parsed separately, which makes it possible to build a UI that expands to reveal the reasoning on demand.
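That parsing step can be sketched in a few lines of Python. This assumes the response body arrives as plain text with a single <think>...</think> block followed by the answer; the actual wire format (for instance, a structured JSON field for the reasoning) may differ, so treat this as illustrative:

```python
import re

def split_contemplating_response(text: str) -> tuple[str, str]:
    """Split a Contemplating-mode response into (reasoning, answer).

    Assumes the reasoning is wrapped in one <think>...</think> block
    followed by the final answer, as described above.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        # Thinking-mode responses carry no think block; the whole
        # body is the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer
```

A UI could feed `reasoning` into a collapsible panel and render `answer` directly.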

API Access Status (as of April 2026)
API key issuance currently runs on a waitlist basis. Meta AI Pro subscribers get priority access. General developer access will open up gradually. The endpoint is accessed via ai.meta.com.

Multimodal — Image Understanding Only, No Generation

Text and images are processed together. You can paste a chart image and request analysis, or drop in an error screenshot for debugging. Handwritten formulas are recognized too. Images are passed to the API alongside text.
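As a sketch, a mixed text-plus-image message could be assembled like this. The content-part layout borrows the widely used OpenAI-compatible convention (a text part plus a base64 data URL); Muse Spark's actual image field names aren't documented here, so verify against the official docs:

```python
import base64

def image_message(prompt: str, image_path: str) -> dict:
    """Build a user message combining text and one image.

    Field names follow the common OpenAI-compatible convention
    (an assumption, not confirmed Muse Spark API structure).
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }
```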

The limitations of this release are clear. Only image understanding is supported. No image generation, no voice input, no video processing. Compared to Gemini 2.5 Pro, which handles text, image, voice, and video all at once, the scope is narrow. This is a first version — that needs to be factored in.

Llama started as text-only too. Multimodal versions were added later. Muse Spark will likely follow the same path. It's too early to write off today's limitations as permanent.

Key Model Comparison

The reasoning model market is broadly split three ways: separate models (OpenAI), a unified model with switchable reasoning depth (Anthropic, Meta), and an all-in-one multimodal model (Google). Each has different operational complexity and cost structures.

Muse Spark is structurally similar to Claude 3.7 Sonnet. Reasoning depth is adjusted within a single model. The difference is that the two modes are clearly separated by name. The intent is obvious just from the parameter names.

Model               Reasoning Approach                  Multimodal                    Open Source
Muse Spark          Thinking / Contemplating (unified)  Text + Image                  No
Claude 3.7 Sonnet   Extended Thinking (unified)         Text + Image                  No
GPT-4o              o1 / o3 (separate models)           Text + Image + Voice          No
Gemini 2.5 Pro      Thinking (unified)                  Text + Image + Voice + Video  No
Llama 3.3           None                                Some versions                 Yes

Use Case Recommendations — When to Use Which Mode

Don't use Contemplating for everything. Response speed slows down and costs go up. For tasks where fast turnaround matters — coding assistance or document summarization — Thinking is enough. The criterion for choosing a mode is simple. If you need to inspect the reasoning process, use Contemplating. If you just need the result, use Thinking.

On the flip side, using only Thinking for tasks that need precise reasoning leads to lower accuracy. Mathematical proofs, logical structure analysis, and multi-step planning call for Contemplating. You need to be able to trace intermediate steps to catch errors.

Task Type                                       Recommended Mode  Reason
Coding assistance / Code review                 Thinking          Fast response, sufficient quality
Document summarization / Translation            Thinking          Speed matters, step-by-step reasoning not needed
Mathematical proofs / Formula verification      Contemplating     Step-by-step reasoning required
Complex bug debugging                           Contemplating     Tracing reasoning process is useful
Image analysis (charts/screenshots)             Thinking          Basic multimodal processing
Multi-step planning / Logical structure design  Contemplating     Intermediate judgment process verification needed
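The table above can be encoded as a small routing helper. The task-category names here are illustrative labels, not an official taxonomy:

```python
# Map task categories to a reasoning_mode, following the table above.
# Category names are illustrative, not part of any official API.
MODE_BY_TASK = {
    "coding": "thinking",
    "summarization": "thinking",
    "translation": "thinking",
    "image_analysis": "thinking",
    "math_proof": "contemplating",
    "debugging": "contemplating",
    "planning": "contemplating",
}

def pick_mode(task: str, default: str = "thinking") -> str:
    """Default to the cheaper, faster Thinking mode for unknown tasks."""
    return MODE_BY_TASK.get(task, default)
```

Defaulting unknown tasks to Thinking matches the article's advice: reserve Contemplating for work where the reasoning trace itself matters.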

Relationship with Llama — Dual Strategy

Llama is open source. Local deployment and fine-tuning are unrestricted. Costs are low and data stays in-house. These advantages aren't replaced by Muse Spark. The two models target different use contexts.

Muse Spark is for when high-performance reasoning is needed. Complex multi-step reasoning, multimodal processing, and building production API-based services fall here. These are tasks that local Llama can't handle well. The two aren't competing. Their roles are divided.

There's a practical combination. Draft generation is handled by the local Llama model. Review and precise reasoning get handed off to the Muse Spark API. It's a structure that balances cost and performance. There's no reason to pick just one. Connecting them in a pipeline is the most practical approach right now.

Llama + Muse Spark Combination Pattern
Step 1 — Draft, classification, preprocessing: Llama local (no cost, no data leaving the system)
Step 2 — Precise reasoning, verification, multimodal: Muse Spark API (Contemplating mode)
Step 3 — Where fast responses are needed: Muse Spark API (Thinking mode)
Result: Reduce overall API costs while deploying high-performance models only where performance matters
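A minimal sketch of that split, assuming hypothetical `llama_generate` (local model) and `muse_spark_call` (API client) callables; both names are placeholders for whatever client code you actually use:

```python
def hybrid_pipeline(document: str, llama_generate, muse_spark_call) -> str:
    """Sketch of the Llama + Muse Spark combination pattern above.

    `llama_generate` and `muse_spark_call` are hypothetical stand-ins
    for a local Llama runner and a Muse Spark API client.
    """
    # Step 1: cheap local draft -- no API cost, data stays in-house
    draft = llama_generate(f"Draft a summary of:\n{document}")

    # Step 2: hand the draft to Contemplating mode for careful review
    review = muse_spark_call(
        messages=[{"role": "user",
                   "content": (f"Verify this summary against the source.\n"
                               f"Source:\n{document}\n\nDraft:\n{draft}")}],
        reasoning_mode="contemplating",
    )
    return review
```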

Access Methods — Meta AI App, Web, and API

There are three paths to Muse Spark: the Meta AI app, Meta.ai web, and the API. Each suits different users and use cases. If you're not a developer, the app or web is enough.

The API is for developers only. All parameters, including reasoning_mode, can be controlled directly. It's used for connecting to production services or building custom pipelines. Access currently runs on a waitlist basis.

Path         Features                                           Recommended For
Meta AI App  iOS/Android, immediate use, login only             General users, mobile-focused
Meta.ai Web  Browser access, no installation needed             Desktop users, quick testing
API          Direct parameter control, production integration   Developers, service building

Frequently Asked Questions

Q. Is Muse Spark a different model from Llama?

They're separate lines. Llama is an open-source public model from Meta FAIR, and Muse Spark is a closed API model from the Superintelligence Research Lab. The operating organizations are different, and so are the access methods. Llama development continues on its own track.

Q. What's the difference between Thinking mode and Contemplating mode?

Thinking is the fast reasoning mode. Response speed comes first, and only the result is returned without intermediate steps. Contemplating thinks longer and deeper. It's suited for tasks that need step-by-step reasoning like math and science problems, and the response includes a <think> block.

Q. Where can I use Muse Spark?

It can be accessed through three paths: the Meta AI app (iOS/Android), Meta.ai web, and API. General users will find the app or web easier. Developers can directly control reasoning_mode through the API. The API currently runs on a waitlist basis.

Q. Is image generation supported?

Not in this release. Only image understanding and analysis are supported. Chart interpretation, screenshot debugging, and handwritten formula recognition are possible. Image generation, voice, and video are not included. They'll likely be added in future releases.

Q. Can I use it alongside Llama?

Yes. It's actually recommended. The practical approach is to split — tasks where local processing and cost matter go to Llama, tasks that need high-performance reasoning go to Muse Spark API. Connecting the two models in a pipeline keeps costs down while maintaining performance.

Meta chose a structure that runs open-source and closed simultaneously. It's not pushing just one direction. Llama keeps going open source. Muse Spark is the high-performance closed line. The two strategies run in parallel without conflict.

Muse Spark is still a first version. Multimodal stops at image understanding. No voice, no video, no image generation. It's not a model with everything right now. Still, the fact that Meta built a separate organization targeting AGI and released a first model is itself a signal. There's now a reason to watch the next release.

Official Sources
— Meta AI Blog: meta.ai/blog
— Meta Superintelligence Research Lab official announcement
— Muse Spark API Documentation: ai.meta.com/docs

This article was written based on publicly announced materials from April 2026. Specifications may change in future releases. Check official documentation for the latest information.