Computer Environments Elicit General Agentic Intelligence in LLMs

Daixuan Cheng^αβ, Shaohan Huang^β, Yuxian Gu^γ, Huatong Song^α, Guoxin Chen^α

Li Dong^β, Wayne Xin Zhao^α, Ji-Rong Wen^α, Furu Wei^β

^αGSAI, Renmin University of China ^βMicrosoft Research ^γTsinghua University

Agentic intelligence in large language models (LLMs) requires not only intrinsic model capabilities but also interaction with external environments. Equipping LLMs with computers is now a prevailing trend. However, the intrinsic value of the computer environment has not been systematically investigated, particularly its potential to elicit general capabilities. Here we introduce LLM-in-Sandbox, which virtualizes the computer as a code sandbox with only basic functionalities, and demonstrate that this minimal setting elicits computer-based meta-capabilities for general task solving: external resource access, file management, and code execution. Without additional training, strong models achieve substantial gains (up to 15.5%) across mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following, while reducing token consumption by up to 8 times. Furthermore, we develop LLM-in-Sandbox-RL to train models exclusively on non-agentic data within the sandbox, empowering weaker models to harness the environment and internalize these interactions. Our results demonstrate that computer environments elicit general intelligence, yield efficiency gains, and can be harnessed through training, serving as a promising foundation for generalist agents.
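To make two of the meta-capabilities named above concrete (file management and code execution), here is a minimal sketch. It is not the LLM-in-Sandbox implementation: the helper `run_in_sandbox`, the file names, and the use of a plain subprocess in a temporary directory are all assumptions chosen for illustration, and a subprocess alone provides no real isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(code: str, workdir: Path, timeout: float = 10.0) -> str:
    """Write `code` to a file in `workdir`, run it there, and return its output.

    Hypothetical helper: a real sandbox would add isolation (for example a
    container); this only confines the working directory.
    """
    script = workdir / "step.py"
    script.write_text(code, encoding="utf-8")
    result = subprocess.run(
        [sys.executable, script.name],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout + result.stderr


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    # File management: a long input lives in a file rather than in the prompt.
    (workdir / "notes.txt").write_text("alpha\nbeta\ngamma\n", encoding="utf-8")
    # Code execution: a model-written snippet reads the file and reports a result.
    output = run_in_sandbox(
        "print(len(open('notes.txt').read().split()))", workdir
    )
    print(output.strip())  # prints "3"
```

Keeping inputs in files and returning only the short program output to the model is one plausible way such a setup could lower token use, though the page does not spell out the mechanism.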

Demo Video

Watch LLM-in-Sandbox solve a chemistry problem: converting IUPAC names to SMILES notation

Task: Given a chemical compound's IUPAC name, identify the correct SMILES representation from multiple choices. The agent downloads a PubChem package and uses it to convert the name to SMILES. Gold Answer: A
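The selection step in this task can be sketched as follows. This is an illustrative outline, not the agent's actual code: `pick_choice`, `stub_resolve`, and the lookup table are hypothetical, and the stub stands in for whatever PubChem client the agent installs. Exact string comparison is also a simplification, since the same molecule can be written as several valid SMILES strings and a robust version would canonicalize both sides first.

```python
from typing import Callable, Dict, Optional


def pick_choice(
    iupac_name: str,
    choices: Dict[str, str],
    resolve: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the label of the choice whose SMILES equals the resolved SMILES."""
    smiles = resolve(iupac_name)
    if smiles is None:
        return None  # the name could not be resolved
    for label, candidate in choices.items():
        # Simplification: plain string equality, no SMILES canonicalization.
        if candidate.strip() == smiles.strip():
            return label
    return None


# Hypothetical stand-in for a PubChem name lookup, so the sketch runs offline.
STUB_TABLE = {"ethanol": "CCO", "methanol": "CO"}


def stub_resolve(name: str) -> Optional[str]:
    return STUB_TABLE.get(name.lower())


choices = {"A": "CCO", "B": "CO", "C": "CCC"}
answer = pick_choice("ethanol", choices, stub_resolve)
print(answer)  # prints "A"
```

Passing the resolver in as a function keeps the matching logic testable without network access; in the sandbox, the agent would supply a resolver backed by the package it downloaded.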

Example Gallery

🧪

Chemical Analysis

Convert IUPAC chemical names to SMILES notation

📥 None 📤 Answer

📐

Math Problem

Solve complex geometry problems with code assistance

📥 None 📤 Answer

📝

Instruction Following

Generate text following strict formatting constraints

📥 None 📤 Answer

📚

Long Context QA

Analyze multiple documents to answer complex questions

📥 Documents 📤 Answer

🗺️

Travel Planning

Create a 3-day Tokyo trip itinerary with interactive map

📥 None 📤 HTML Map

🎨

Poster Design

Design a promotional poster for a tech conference

📥 Event JSON 📤 SVG + PNG

🎬

Video Creation

Create a birthday countdown video with animations

📥 Theme JSON 📤 MP4 Video

🎵

Music Composition

Compose original ambient piano music with MIDI

📥 None 📤 MIDI + Audio

Citation

If you find our work helpful, please cite us:

@article{cheng2026llm,
  title={Llm-in-sandbox elicits general agentic intelligence},
  author={Cheng, Daixuan and Huang, Shaohan and Gu, Yuxian and Song, Huatong and Chen, Guoxin and Dong, Li and Zhao, Wayne Xin and Wen, Ji-Rong and Wei, Furu},
  journal={arXiv preprint arXiv:2601.16206},
  year={2026}
}
