Introduction
Robots play an increasingly important role in daily life, and both their range of applications and the research fields devoted to them continue to expand (Fong et al., 2003). In recent years, deep learning has taken an important position in robotics research (Janiesch et al., 2021). Deep learning is a neural network-based approach that enables machines to process and analyze data more effectively, giving robots stronger perception and decision-making capabilities and helping them adapt to different environments and tasks. It has been widely used to improve robot performance (Heo et al., 2019). In particular, convolutional visual perception (Ran et al., 2021) and prediction models (Dasari, Erbert et al., 2019) have become focal points of research. These models use convolutional neural networks (CNNs) to extract key information about the environment from sensor data through tasks such as object detection, motion estimation, and environment modeling, and this information is a crucial input to robot motion planning. Continued development and innovation have enabled robots to take on tasks in many different fields. Whether automating manufacturing on industrial production lines, performing precise operations in surgery, or exploring uncharted territory in space (Biswal & Mohanty, 2021), robots play a key role and have become an integral part of modern society.
However, as robots spread into more application areas, a series of challenges has emerged. These include, but are not limited to, how to enable robots to better perceive and understand their surroundings (Rubio et al., 2019), how to make intelligent decisions in complex and unknown situations (Herrera-Viedma et al., 2020), and how to achieve more natural human-robot interaction (Andronas et al., 2021). Although computational approaches such as deep learning have made some progress on these problems, existing studies still have limitations. They fall short of the real-time requirements of motion planning (Castillo-Lopez et al., 2020), the realism required of environment sensing data (Martinez-Gonzalez et al., 2020), and the accuracy required of target detection and tracking (Wu et al., 2022). The field therefore requires continued research and innovation to meet growing demands, to give robots a greater role in various fields, and to ensure that they can interact with people safely and efficiently.
To overcome these shortcomings, we devised the MPC-WGAN-faster R-CNN network model. We chose model predictive control (MPC), Wasserstein generative adversarial networks (WGAN), and the faster region-based CNN (faster R-CNN) because their strengths are complementary and together can improve our model's performance in planning, perception, and detection. We selected MPC for its precision and adaptability in real-time trajectory planning, which allows robot actions to be forecast and adjusted efficiently in dynamic environments. We then integrated WGAN to generate realistic synthetic visual data, which enriches the training of the model's visual perception and, in turn, the robot's ability to interpret its surroundings. Finally, we adopted faster R-CNN because it provides fast and reliable recognition of objects in the environment, supplying essential information for motion planning.
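To make the receding-horizon idea behind MPC concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a toy one-dimensional robot modeled as a double integrator (position and velocity), a small set of candidate accelerations, and a quadratic tracking cost. The function names (`simulate`, `cost`, `mpc_step`, `run`), the time step, the horizon length, and the effort weight are all hypothetical choices made for illustration. At every step the controller searches all control sequences over a short horizon, applies only the first control of the best sequence, and then replans from the new state.

```python
import itertools


def simulate(state, controls, dt=0.1):
    """Roll a 1-D double integrator (position, velocity) forward in time."""
    pos, vel = state
    trajectory = []
    for u in controls:
        vel = vel + u * dt      # acceleration u changes the velocity
        pos = pos + vel * dt    # the updated velocity changes the position
        trajectory.append((pos, vel))
    return trajectory


def cost(trajectory, controls, target):
    """Squared tracking error over the horizon plus a small control-effort penalty."""
    tracking = sum((pos - target) ** 2 for pos, _ in trajectory)
    effort = 0.01 * sum(u ** 2 for u in controls)  # assumed weight
    return tracking + effort


def mpc_step(state, target, horizon=4, candidates=(-1.0, 0.0, 1.0)):
    """Exhaustively search control sequences and return the first control of the best one."""
    best = min(
        itertools.product(candidates, repeat=horizon),
        key=lambda seq: cost(simulate(state, seq), seq, target),
    )
    return best[0]


def run(start=(0.0, 0.0), target=1.0, steps=100):
    """Closed loop: replan at every step and apply only the first planned control."""
    state = start
    for _ in range(steps):
        u = mpc_step(state, target)
        state = simulate(state, [u])[-1]
    return state


if __name__ == "__main__":
    final_pos, final_vel = run()
    print(f"final position {final_pos:.3f}, final velocity {final_vel:.3f}")
```

Exhaustive search is used here only because the toy problem is tiny. A practical planner would typically replace it with a numerical optimizer and a richer dynamics model, and would take obstacle and target information from the perception modules.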