Large Model–Empowered Technical Architecture and Application Study for Intelligent Interaction and Autonomous Decision-Making in Humanoid Robots
DOI: https://doi.org/10.65196/zxmm8035
Keywords: Large-scale models; Humanoid robots; Multimodal interaction; Autonomous decision-making; Task planning; Edge–cloud collaboration; Trustworthy safety
Abstract
With the rapid advancement of artificial intelligence, large-scale models have demonstrated notable strengths in natural language understanding, multimodal fusion, knowledge reasoning, and generative interaction, providing a new technical foundation for robotic systems to evolve from “function-oriented automation” toward “general-purpose intelligent agents.” Humanoid robots are a key direction in intelligent equipment: because of their human-like morphology and compatibility with human environments, they show broad application potential in intelligent manufacturing, public services, healthcare and eldercare, and emergency rescue. However, conventional humanoid-robot systems still fall short in semantic understanding, complex task planning, cross-scenario generalization, and safe controllability in open environments, which makes it difficult for them to meet the practical demands of highly dynamic and uncertain real-world settings. In response, this paper focuses on the key issues of large model–empowered intelligent interaction and autonomous decision-making in humanoid robots. It systematically analyzes the roles of large models in multimodal perception fusion, semantic understanding, intent recognition, task decomposition, policy generation, and behavioral execution; proposes an engineering-oriented overall technical architecture; and discusses implementation pathways for edge–cloud collaborative deployment, real-time control constraints, trustworthy safety governance, and typical application scenarios. The study suggests that upgrading humanoid-robot intelligence with large models depends chiefly on building a closed-loop framework of “multimodal perception–semantic understanding–knowledge reasoning–task planning–execution feedback,” while enhancing system reliability and controllability through constraint mechanisms, alignment strategies, and safety controls. This work provides a reference for the industrial deployment of humanoid robots and the engineering design of intelligent-agent systems.
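As a rough illustration of the closed loop named in the abstract, the sketch below chains perception, semantic understanding, task planning, a safety constraint check, and execution feedback. It is not the architecture proposed in the paper: every function name, the `FORBIDDEN` action set, and the rule-based stand-ins for large-model calls are hypothetical and chosen only to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical set of actions that a safety-constraint layer refuses to execute.
FORBIDDEN = {"exceed_joint_limit", "enter_restricted_zone"}


@dataclass
class LoopState:
    """Record of one pass through the perception-to-feedback loop."""
    observation: str
    intent: str = ""
    plan: List[str] = field(default_factory=list)
    executed: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def perceive(raw: Dict) -> str:
    # Stand-in for multimodal fusion: merge speech and vision into one description.
    return f"{raw['speech']} | sees: {', '.join(raw['vision'])}"


def understand(observation: str) -> str:
    # Stand-in for large-model semantic understanding and intent recognition.
    return "fetch_cup" if "cup" in observation else "idle"


def make_plan(intent: str) -> List[str]:
    # Stand-in for knowledge reasoning and task decomposition.
    # The plan deliberately contains one unsafe step to exercise the filter.
    plans = {
        "fetch_cup": [
            "walk_to_table",
            "enter_restricted_zone",
            "grasp_cup",
            "return_to_user",
        ]
    }
    return plans.get(intent, [])


def is_safe(step: str) -> bool:
    # Constraint mechanism: a step passes only if it is not forbidden.
    return step not in FORBIDDEN


def run_loop(raw: Dict) -> LoopState:
    state = LoopState(observation=perceive(raw))
    state.intent = understand(state.observation)
    state.plan = make_plan(state.intent)
    for step in state.plan:
        # Execution feedback: record what ran and what the safety layer blocked.
        if is_safe(step):
            state.executed.append(step)
        else:
            state.rejected.append(step)
    return state


if __name__ == "__main__":
    result = run_loop({"speech": "please bring me the cup", "vision": ["table", "cup"]})
    print(result.intent, result.executed, result.rejected)
```

In this sketch a blocked step is simply skipped and logged. A real system would more plausibly halt and return the rejection to the planner for replanning, which is the kind of feedback the closed-loop framing implies.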
License
Copyright (c) 2026 Journal of science and technology exploration

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.