Applications: motion capture, UAV, exoskeleton robot, humanoid robot
In 2024, robots took a big step toward becoming more humanlike. Mobile ALOHA, a dual-arm robot capable of cooking and household chores developed by a team from Stanford University and Google DeepMind; Tesla's humanoid robot performing complex operations such as folding clothes; and the emergence of Sora have all sparked the public's imagination about an era of general-purpose robots. How do robots perform complex humanlike operations? After the wave set off by ChatGPT, multimodal embodied VLMs represented by Google's PaLM-E continued to emerge, and the robotics field began to adopt new technologies such as motion capture alongside a variety of AI training methods. A recent industry research report pointed out that to make robots more humanlike, five mainstream exploration paths currently dominate research and industry: virtual simulation, teleoperation, imitation learning, VLM + small models, and VLA. Among these, motion capture technology has begun to receive increasing attention.
New breakthroughs in the field of drones
Drones are a typical research field in which motion capture technology is extensively applied. When multiple drones collaborate on complex tasks, each drone needs high-precision, high-stability spatial position information, so many cutting-edge drone studies use optical motion capture to advance the collaborative control of robot swarms. With optical motion capture, real-time tracking and control of multiple drones becomes straightforward, enabling coordinated swarm operations and intelligent control. For example, a team at Shanghai University published a paper titled "Dual UAV Collaborative Suspension Transport Path Planning Based on Artificial Potential Field A* Algorithm" in the artificial intelligence journal Knowledge-Based Systems. Using CHINGMU's (Qingtong Vision's) infrared optical motion capture system, the team proposed and validated an APF-A* algorithm for dual-quadcopter collaborative suspension transport that can efficiently plan safe and reliable paths in complex multi-obstacle environments.
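The paper's exact formulation is not reproduced here, but the core idea of biasing A* search with an artificial potential field can be sketched in a few lines of Python. A repulsive potential inflates the cost of grid cells near obstacles so the planner keeps a safe clearance; the grid size, gain eta, and influence radius d0 below are assumed values for illustration, not the authors' parameters.

```python
import heapq
import itertools
import math

def repulsive_potential(cell, obstacles, eta=2.0, d0=3.0):
    """Illustrative APF term: cost grows as a cell approaches an obstacle.
    The gain eta and influence radius d0 (in cells) are assumed values."""
    u = 0.0
    for ox, oy in obstacles:
        d = math.hypot(cell[0] - ox, cell[1] - oy)
        if d < 1e-6:
            return float("inf")      # on top of an obstacle: impassable
        if d < d0:
            u += 0.5 * eta * (1.0 / d - 1.0 / d0) ** 2
    return u

def apf_astar(start, goal, obstacles, size=20):
    """A* over a size x size grid whose step costs are inflated by the
    repulsive potential, steering the path away from obstacles early."""
    tie = itertools.count()          # tie-breaker so the heap never compares cells
    def h(c):                        # admissible straight-line heuristic
        return math.hypot(goal[0] - c[0], goal[1] - c[1])
    open_set = [(h(start), 0.0, next(tie), start, None)]
    came, closed = {}, set()
    while open_set:
        _, g, _, cur, parent = heapq.heappop(open_set)
        if cur in closed:
            continue
        closed.add(cur)
        came[cur] = parent
        if cur == goal:              # walk parents back to reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1),
                       (1, 1), (1, -1), (-1, 1), (-1, -1)):
            nxt = (cur[0] + dx, cur[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in closed:
                continue
            step = math.hypot(dx, dy) + repulsive_potential(nxt, obstacles)
            heapq.heappush(open_set, (g + step + h(nxt), g + step, next(tie), nxt, cur))
    return None                      # no path found

print(apf_astar((0, 0), (19, 19), [(8, 8), (9, 9), (12, 10)]))
```

In the paper's setting the planned path must additionally respect the coupled dynamics of the two quadcopters and the suspended load; this sketch covers only the path-search layer.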
In this study, Professor Rao Jinjun's team from the School of Mechatronic Engineering and Automation at Shanghai University observed that, compared with traditional drones, quadcopters have lower fault tolerance, so the safety of their trajectory planning must be examined. The team collected real flight data with the optical motion capture system and built a virtual environment on the AirSim simulation platform. Through reinforcement learning, the agent explored by trial and error in the virtual environment, learning an optimal behavior strategy that was then mapped back to reality, enabling the drones to change formation in time and achieve safe, effective collaborative aerial transport. This work helps improve aircraft obstacle avoidance and emergency response capabilities and enhances transport safety.
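As a schematic of that trial-and-error process, the following minimal tabular Q-learning loop learns to reach a goal cell while avoiding a hazardous one. The 4x4 grid, reward values, and hyperparameters are toy assumptions standing in for the team's far richer AirSim environment.

```python
import random

SIZE, GOAL, HAZARD = 4, (3, 3), (1, 2)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # right, left, down, up
Q = {}
alpha, gamma, eps = 0.1, 0.95, 0.2              # assumed hyperparameters

def q(s, a):
    return Q.get((s, a), 0.0)

def step(s, a):
    s2 = (min(SIZE - 1, max(0, s[0] + a[0])),
          min(SIZE - 1, max(0, s[1] + a[1])))
    if s2 == HAZARD:
        return s2, -10.0, True       # "collision": episode ends with a penalty
    if s2 == GOAL:
        return s2, +10.0, True       # target reached
    return s2, -0.1, False           # small step cost favors short paths

for episode in range(2000):
    s, done = (0, 0), False
    while not done:
        # epsilon-greedy: explore occasionally, otherwise act greedily
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda b: q(s, b))
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * max(q(s2, b) for b in ACTIONS))
        Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
        s = s2

print(max(ACTIONS, key=lambda b: q((0, 0), b)))  # learned first move
```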
As one of the few optical motion capture brands in China capable of large-scale commercialization, CHINGMU offers products that collect three-dimensional spatial trajectory coordinates and deliver low-latency, highly stable, easy-to-use, precise 6DOF tracking, giving users the key performance parameters they need. In the development of unmanned vehicles or drones, for example, developers can quickly run SLAM algorithm validation: third-party servers connect to the system's PoE switch over the network and work with the motion capture server, enabling more intuitive end-to-end interaction and validation.
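As a sketch of what such end-to-end validation might look like on the receiving side, the snippet below consumes ground-truth 6DOF poses over UDP and compares each with a SLAM position estimate. CHINGMU's actual SDK and wire format are not assumed here: the port number and the packed layout (timestamp, position, quaternion as little-endian doubles) are invented purely for illustration.

```python
import socket
import struct

# Hypothetical packet layout: t, x, y, z, qw, qx, qy, qz (8 doubles).
POSE_FMT = "<d3d4d"
POSE_SIZE = struct.calcsize(POSE_FMT)   # 64 bytes

def pose_stream(host="0.0.0.0", port=9000):
    """Yield ground-truth poses from a (hypothetical) mocap UDP feed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, _ = sock.recvfrom(POSE_SIZE)
        t, x, y, z, qw, qx, qy, qz = struct.unpack(POSE_FMT, data)
        yield t, (x, y, z), (qw, qx, qy, qz)

def ate_per_frame(gt_pos, slam_pos):
    """Absolute trajectory error for one frame: Euclidean distance
    between the mocap ground truth and the SLAM position estimate."""
    return sum((g - s) ** 2 for g, s in zip(gt_pos, slam_pos)) ** 0.5

# Usage sketch (slam_estimate is whatever the system under test reports):
#   for t, pos, quat in pose_stream():
#       print(ate_per_frame(pos, slam_estimate(t)))
```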
Exoskeleton robots become a new application field
Beyond drones, using motion capture to optimize motion and posture is another hot research topic. Because motion capture can accurately replicate the actions of real people and relay them as commands to robots through sensors and related devices, the input information and resulting actions are precise and targeted. Fused with teleoperation, this approach can quickly produce human "avatars".
Mobile ALOHA, for example, learned operational skills and body control by imitating human demonstration movements from data collected by its own system. Its hardware requirements are modest, with a complete setup costing only about $32,000, and the solution is open source. At that cost and configuration, Mobile ALOHA can perform many functions in a home environment, the key being the robot's motion control, interaction, and autonomous movement. Beyond that, Tencent Robotics X has taken a motion capture approach based on real dogs in the field of quadruped robots.
Traditional teleoperation mainly used specialized equipment to transfer human actions directly to robots, with the robots then collecting data through sensors to achieve bidirectional feedback. But this requires a clearly defined object of operation; otherwise, the success rate in real environments cannot be guaranteed. Now, with the rapid development of large models, motion capture technology can supply not only more detailed motion data but also richer environmental data. Given that robots advance mostly through engineering applications, combining motion capture with teleoperation is undoubtedly more direct and effective in practice: it addresses the needs of real scenarios and is especially suitable for humanoid robots and similar products.
For example, the School of Intelligent Manufacturing at Jianghan University adopted the CHINGMU motion capture system to provide technical, equipment, and full-process service support for its exoskeleton robot research. With a 3D motion capture system, researchers can accurately and completely capture the state characteristics of human lower-limb joints across various motion scenarios, providing reliable reference and comparison data for test experiments. This data support helps optimize the motion control strategy of exoskeleton robots, reproduce human motion postures more realistically, and offers strong backing for overcoming technical difficulties, improving experimental accuracy, and optimizing the simulation control of lower-limb exoskeleton robots.
CHINGMU's optical motion capture technology performs three-dimensional data analysis and recording, accurately identifying and recording the motion states of different objects. It provides technical support for lower-limb movement trajectories that are difficult to record with the naked eye, and helps collect important parameters such as joint angle, velocity, and acceleration for experiments on the motion characteristics of human lower-limb joints.
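A minimal sketch of how such parameters can be derived from marker trajectories: given hip, knee, and ankle marker positions over time, the knee angle follows from the thigh and shank vectors, and finite differences yield angular velocity and acceleration. The marker set, the 100 Hz sample rate, and the synthetic data below are assumptions for illustration; a real study would load the mocap system's exported trajectories.

```python
import numpy as np

FS = 100.0                                    # assumed sampling rate, Hz

def knee_angle(hip, knee, ankle):
    """Angle at the knee between thigh and shank vectors, in degrees.
    Inputs are (N, 3) arrays of marker positions over N frames."""
    thigh = hip - knee
    shank = ankle - knee
    cosang = np.sum(thigh * shank, axis=1) / (
        np.linalg.norm(thigh, axis=1) * np.linalg.norm(shank, axis=1))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# Synthetic stand-in for exported marker data: a slow flexion-extension.
t = np.arange(0, 2, 1 / FS)
hip = np.column_stack([np.zeros_like(t), np.zeros_like(t), np.full_like(t, 0.9)])
knee = np.column_stack([0.1 * np.sin(t), np.zeros_like(t), np.full_like(t, 0.5)])
ankle = np.column_stack([0.2 * np.sin(t), np.zeros_like(t), np.full_like(t, 0.1)])

theta = knee_angle(hip, knee, ankle)          # joint angle, deg
omega = np.gradient(theta, 1 / FS)            # angular velocity, deg/s
accel = np.gradient(omega, 1 / FS)            # angular acceleration, deg/s^2
print(theta[:3], omega[:3], accel[:3])
```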
Humanoid robots create demand
The most talked-about humanoid robots today exhibit decision-making and execution abilities that are gradually approaching long-held expectations, and motion capture technology underpins part of this progress.
For example, the earlier video of Tesla's robot folding clothes relied largely on similar technology. When Tesla showcased Optimus's progress at its 2023 shareholders' meeting, the engineers in the video wore capture devices; looking closely, the human grasping actions were recognized by AI algorithms and replicated on the robot. With its high resolution, high speed, and high precision, optical motion capture can provide humanoid robots with more accurate spatial position information, greatly improving their positioning accuracy and stability and enabling them to better complete tasks in complex environments.
Currently, CHINGMU's optical motion capture products achieve a spatial positioning accuracy of 0.1 mm, an angular accuracy of 0.1°, and a jitter error of only 0.01 mm, with a large field of view, long tracking distance, and few blind spots. They support both active and passive markers, adapt to a variety of settings, cover capture areas of thousands of square meters, track hundreds of targets in real time, support multiple protocols, and are compatible with all mainstream software.
Historically, the expectation for humanoid robots has been that they possess human-like structure and mobility so they can better adapt to human living and working environments. With optical motion capture, researchers can capture and analyze a humanoid robot's motion data in real time, improving its structure and motion control algorithms from a biomimetic perspective and enhancing its flexibility and speed.
Going further, since humanoid robots are important tools for interacting and collaborating with humans, they will inevitably need a high degree of intelligence and autonomy. Imitation learning from human demonstrations is a promising path for training robots to master real-world skills, offering strong generalization and transferability, and is especially suited to daily-life scenarios.
Through optical motion capture, researchers can also obtain robots' motion data and posture information in real environments and use it to optimize robot behavior and decision-making, improving the accuracy and efficiency of human-robot interaction. For example, "MimicPlay: Long-Horizon Imitation Learning by Watching Human Play" (Chen Wang et al.), from Stanford-affiliated researchers, describes a framework that learns latent plans from human play data to guide low-level visuomotor control trained on minimal teleoperation demonstrations, ultimately achieving strong task success rates, generalization, and robustness to disturbances.
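The two-level idea can be sketched schematically: a high-level planner, trained on plentiful human play data, maps observations to a latent plan, and a low-level policy, trained on a small number of teleoperation demonstrations, is conditioned on that plan. The dimensions and network sizes below are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

OBS_DIM, PLAN_DIM, ACT_DIM = 32, 8, 7      # e.g. a 7-DOF arm command

class HighLevelPlanner(nn.Module):
    """Trained on cheap human play data: observation -> latent plan."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, PLAN_DIM))
    def forward(self, obs):
        return self.net(obs)

class LowLevelPolicy(nn.Module):
    """Trained on few teleoperation demos: (obs, plan) -> action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + PLAN_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM))
    def forward(self, obs, plan):
        return self.net(torch.cat([obs, plan], dim=-1))

planner, policy = HighLevelPlanner(), LowLevelPolicy()
obs = torch.randn(1, OBS_DIM)              # stand-in for visual features
action = policy(obs, planner(obs))         # the latent plan guides control
print(action.shape)                        # torch.Size([1, 7])
```

The design point is the division of labor: play data is abundant but unlabeled with task intent, so it trains only the planner, while scarce teleoperation data teaches the low-level controller to follow plans.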
Because of robots' own physical characteristics, many actions and paths achievable by humans require further cleaning and optimization before they can be faithfully reproduced on robots. Although bottlenecks remain in applying mocap-captured human data to robots, it is still a very promising direction.
Conclusion and outlook
Many studies worldwide have attempted to combine new technologies such as large models with robots. However, directly interacting with and sampling from robots or robotic arms in real environments generally suffers from low sampling efficiency and the safety risks of random sampling. For costly humanoid robots in particular, excessive joint rotation or irreversible collision damage during obstacle avoidance tasks is a loss that is hard to bear.
Training reinforcement learning algorithms in simulators with the help of optical motion capture systems can solve these problems well. Moreover, building on powerful optical motion capture products, complex environments rich in visual elements can be generated to form large-scale simulation datasets, enriching the environmental and object detail in robot samples. This lowers the data barriers in niche scenarios and reduces issues such as latency and hallucination that can arise when mapping and transferring samples to real robots.
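One common way to build such datasets is domain randomization: varying scene parameters around captured trajectories so a learned policy does not overfit a single environment. The sketch below generates randomized scene configurations around a toy mocap-style path; all scene fields and parameter ranges are illustrative assumptions.

```python
import json
import random

def randomize_scene(base_trajectory, rng):
    """Produce one randomized scene configuration around a captured path."""
    n_obstacles = rng.randint(0, 5)
    return {
        "trajectory": base_trajectory,            # e.g. a mocap-recorded path
        "lighting_scale": round(rng.uniform(0.3, 1.0), 3),
        "obstacle_positions": [
            [round(rng.uniform(-2, 2), 2), round(rng.uniform(-2, 2), 2)]
            for _ in range(n_obstacles)],
        "camera_jitter_deg": round(rng.uniform(-3, 3), 2),
        "floor_texture_id": rng.randrange(10),
    }

rng = random.Random(0)                            # seeded for reproducibility
base_path = [[0.0, 0.0], [0.5, 0.2], [1.0, 0.4]]  # toy mocap-style trajectory
dataset = [randomize_scene(base_path, rng) for _ in range(1000)]
print(json.dumps(dataset[0], indent=2))
```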
CHINGMU, whose core technology originates from the Institute of Automation of the Chinese Academy of Sciences in Beijing, has full independent control of its software and hardware and has already released multiple mature cameras and solutions. Its UW, D, K, and R series cameras are all well suited to cutting-edge university research; in particular, the high-end K (Kunpeng) series optical camera, released in January 2024, achieves significant breakthroughs in resolution, frame rate, and capture distance, making it very suitable for research in fields such as humanoid robots.
As a leading enterprise in optical motion capture, CHINGMU has seen its core MC-series optical cameras applied across many industries, and several of its products can be deeply customized to meet individual needs, making the company a reliable choice for future robot manufacturers and researchers.