03/12 2026

"Is this truly autonomous, or is it being remotely controlled?"
On March 10 (local time), after Figure AI released a demonstration video of its Figure 03 humanoid robot performing household chores, this pointed question from Elon Musk on social media stripped away the sector's veneer and centered the online discourse on a single debate: was this seemingly flawless housework demonstration a genuine leap forward in embodied AI, or merely an elaborately staged remote-controlled act?

We analyzed the video frame by frame, drawing on industry insider knowledge, in an attempt to unearth the truth concealed in the footage:
First, let's revisit the demonstration that sparked the skepticism. In the two-and-a-half-minute video, Figure 03 moves through an open-plan living room: it grasps a spray bottle to sanitize surfaces, wipes down the table with a cloth, relocates a cup, tidies up toys, turns off the TV, and then carefully straightens the remote control. It even treats objects differently, picking up a cup delicately while casually tossing a sofa cushion back into place. The entire sequence unfolds seamlessly, mirroring the rhythm of human housekeeping.
Figure AI responded swiftly: the entire process was fully autonomous, driven by its Helix 02 neural-network system without any remote intervention. The robot perceives its surroundings through onboard cameras, and its "universal brain" makes decisions independently and orchestrates full-body movement. No additional programming is necessary; it can acquire new housework skills simply from additional training data.
However, Musk's skepticism was far from unfounded—the robotics industry has long harbored a tacit understanding: Many ostensibly impressive demonstration videos are, in fact, the result of "remote control." For instance, Shenzhen Dobot's Atom robot can be remotely manipulated by engineers using VR devices and ultra-low-latency transmission to cook steak, with response times so swift that they are virtually indistinguishable from autonomous operation to the untrained eye. Can Figure 03's demonstration truly evade "remote control suspicions"?
A frame-by-frame examination uncovers three pivotal details that hint at autonomy versus remote control, each deserving of close scrutiny.
Detail 1: The "Continuity Gap" in Movements—seamless at first glance, yet marked by telltale delays. In the video, after wiping the table, Figure 03 drapes the towel over its shoulder to free its hands for a storage box. The transition appears natural, but slow-motion playback reveals a subtle 0.5-second lag between the end of the towel's arc and the start of the storage-box grab. Under fully autonomous decision-making, the movements should chain without pause; after all, the Helix 02 system emphasizes "end-to-end full-process control." Under remote control, some delay is inevitable, because the operator must observe through the cameras before issuing the next command, leaving faint traces even with millisecond-level transmission.
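A gap like the 0.5-second lag described above is the kind of signal that can be checked mechanically rather than by eye. Below is a minimal sketch, assuming we have already extracted a per-frame motion-magnitude series from the video (for example, mean absolute pixel change between consecutive frames); it flags near-still stretches between action segments that exceed a chosen duration. The function name, thresholds, and toy trace are all illustrative assumptions, not part of Figure AI's or anyone's actual tooling.

```python
# Sketch: flag suspicious pauses between action segments in a video.
# Input: per-frame motion magnitudes (e.g., mean absolute pixel change
# between consecutive frames) plus the video frame rate.
# All names and thresholds here are illustrative assumptions.

def find_pauses(motion, fps, still_thresh=0.05, min_pause_s=0.3):
    """Return (start_s, duration_s) for every near-still stretch
    lasting at least min_pause_s seconds."""
    pauses = []
    run_start = None
    for i, m in enumerate(motion):
        if m < still_thresh:            # frame is effectively still
            if run_start is None:
                run_start = i
        else:
            if run_start is not None:   # a still run just ended
                dur = (i - run_start) / fps
                if dur >= min_pause_s:
                    pauses.append((run_start / fps, dur))
                run_start = None
    if run_start is not None:           # still run extends to the end
        dur = (len(motion) - run_start) / fps
        if dur >= min_pause_s:
            pauses.append((run_start / fps, dur))
    return pauses

# Toy trace at 10 fps: activity, a 0.5 s pause, then activity again.
trace = [0.4] * 10 + [0.01] * 5 + [0.3] * 10
print(find_pauses(trace, fps=10))  # → [(1.0, 0.5)]
```

A repeated, fixed-length pause right before each new action would be consistent with an operator's observe-then-command loop, whereas an end-to-end policy would show no such regular gap; a single pause, of course, proves nothing on its own.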
Detail 2: The "Consistency Discrepancy" in Object Recognition—Inconsistent handling of similar items. In the video, Figure 03 accurately differentiates between cups and toys but exhibits "grip force variations" when manipulating two similar building blocks: one is gently placed in the storage box, while the other is firmly pressed down. For an autonomously learning AI, the handling logic for similar objects should remain consistent, as its core strength lies in a "universal model encompassing all actions." With remote control, however, inconsistencies in operators' hand force control are inevitable, particularly during extended sessions—a classic hallmark of remote operation.
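The consistency argument above can likewise be framed as a measurement: if a single policy drives the arm, repeated handling of near-identical objects should show low spread in a proxy signal such as peak grip force or placement speed, while a human teleoperator's force control drifts over a session. Here is a small sketch of that idea using the coefficient of variation; the force values, threshold, and function names are hypothetical illustrations, not measurements from the Figure 03 video.

```python
# Sketch: quantify handling consistency across similar objects.
# Assumption: we can log a per-grasp proxy signal (e.g., peak grip
# force in newtons). Values and threshold below are illustrative.
from statistics import mean, pstdev

def consistency_score(samples):
    """Coefficient of variation: pstdev / mean. Lower = more consistent."""
    mu = mean(samples)
    return pstdev(samples) / mu if mu else float("inf")

def flag_inconsistent(samples, cv_thresh=0.15):
    """True if spread across similar-object handling exceeds threshold."""
    return consistency_score(samples) > cv_thresh

# Hypothetical peak grip forces (N) for two near-identical blocks:
autonomous_like = [4.9, 5.1, 5.0, 5.0]   # tight cluster
teleop_like     = [3.2, 6.8, 4.1, 5.9]   # one gentle, one pressed hard

print(flag_inconsistent(autonomous_like))  # → False
print(flag_inconsistent(teleop_like))      # → True
```

The 0.15 cutoff is an arbitrary placeholder; a real analysis would calibrate it against logged autonomous runs rather than pick it by hand.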
Detail 3: The "Overly Adapted Scene"—absence of unexpected events, suggesting a prearranged scenario. The demonstration setting appears excessively tidy: no cluttered objects or sudden obstacles in the living room, and even the lighting is uniformly distributed, lacking the randomness of a real home. The core challenge of household scenarios is uncertainty, yet Figure 03's actions fit the scene perfectly, without hesitation or adjustment. Industry practice suggests that remote-controlled demonstrations often prearrange scenes and object positions to prevent operational errors caused by environmental complexity. A truly autonomous robot, by contrast, should demonstrate real-time adjustment to unexpected events (e.g., an object falling)—something entirely absent from the video.
Of course, these details remain "plausible speculations" without definitive proof. After all, the Helix 02 system powering Figure 03 does possess technical merit—trained on synthetic data from platforms like NVIDIA Isaac Lab, it can swiftly master complex movements. Its predecessor, Figure 02, has already accumulated 1,250 hours of autonomous operation on production lines, showcasing some autonomous capabilities.
In reality, the crux of this controversy has never been whether Figure 03 "cheated," but rather a "trust crisis" plaguing the entire embodied AI sector.
Figure AI founder Brett Adcock emphasizes "full autonomy," but genuine trust is not built through video demonstrations alone but through verifiable "operating hours"—like Figure 02's 1,250-hour continuous factory operation, which resonates more profoundly than any demo. Remote control may sustain short-term demonstrations but cannot support long-term operation; preset scenes may conceal technical flaws but cannot withstand real-world complexity.
Musk's skepticism serves as a wake-up call for the entire industry: The essence of embodied AI lies in "autonomous decision-making and environmental adaptation," not "feigning autonomy through remote control." If the industry remains fixated on "fake demonstrations," it will erode not just investor confidence but the future of the entire sector.
As of now, Figure AI has released no continuous-operation data for Figure 03 and has not permitted third-party verification. We can maintain a rational optimism: if Figure 03 truly achieves full autonomy, that would mark a major breakthrough in embodied AI. But if this was merely a remote-controlled performance, the industry and the market will inevitably leave it behind.
After all, genuine technological advancements never rely on "flawless videos" for packaging. Time and real-world applications are the ultimate litmus tests. This controversy over "autonomy versus remote control" will also propel the humanoid robotics industry from "unbridled growth" toward "standardized development."