| Abstract [eng] |
Virtual reality (VR) technologies are becoming increasingly popular, driven by new hardware releases, more advanced algorithms, a growing number of VR-enabled systems, and a broader range of applications across various fields. Hand tracking is one of the most important technologies for creating an immersive user interface in VR systems. The Meta Quest 2 virtual reality system supports hand tracking using only the headset, with no additional controllers, but encounters problems: hand tracking is impaired when part of the hand is occluded during movement, when additional devices (e.g. haptic gloves) are worn on the hands, or when lighting conditions are sub-optimal. These issues cause loss of tracking. The main aim of this research is to propose artificial intelligence (AI) methods that predict hand joint spatial data in virtual reality systems when tracking is lost, and to evaluate their performance. The focus is on short-term predictions that compensate for common losses of tracking, thus increasing the continuity and immersion of VR applications that use hand tracking. Recurrent neural networks (RNNs) are used, including long short-term memory (LSTM) and gated recurrent unit (GRU) networks with modifications. The dataset is constructed and visualized using Unity with the OpenXR package. Additional research is done to find a performant method of bi-directional communication between the AI methods and the VR subsystems; the pythonnet package is used. The hand joint position, rotation, and linear and angular velocity from the last fifteen frames are used to predict the next frame of joint spatial data when tracking is lost. The experiments suggest that the optimal combination of parameters achieves an average Euclidean position prediction error of 1.11 cm and an average angular error of 8.04 degrees when predicting a single frame into the future. A separate model suggests that rotation can be predicted with an average angular error of 5.2 degrees.
Multi-frame prediction experiments show that autoregressive training improves long-term prediction accuracy. Average Euclidean position errors of 1.47 cm, 1.62 cm, 3.36 cm and 10.65 cm were achieved when predicting 1, 5, 10 and 30 frames into the future, respectively. The results show that the system is capable of improving VR hand tracking without additional hardware or loss of performance in real-time applications. |
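
The two error measures and the multi-frame prediction loop described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: rotations are taken to be unit quaternions, the angular error is taken to be the smallest rotation angle between the predicted and true orientation, and each frame is reduced to a single 3D position. The `constant_velocity` function is a hypothetical stand-in for the trained LSTM or GRU network, and all function names are illustrative only.

```python
import math
from collections import deque

WINDOW = 15  # number of past frames fed to the predictor, as stated in the abstract


def position_error(p_pred, p_true):
    """Euclidean distance between two 3D points (same unit as the inputs)."""
    return math.dist(p_pred, p_true)


def angular_error_deg(q_pred, q_true):
    """Smallest rotation angle, in degrees, between two unit quaternions.

    Assumption: rotations are unit quaternions; q and -q count as the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    return math.degrees(2.0 * math.acos(min(1.0, dot)))


def constant_velocity(frames):
    """Hypothetical stand-in for the trained network: linear extrapolation."""
    last, prev = frames[-1], frames[-2]
    return tuple(2.0 * a - b for a, b in zip(last, prev))


def rollout(predict_next, history, steps):
    """Predict `steps` frames autoregressively, feeding each prediction back in."""
    window = deque(history[-WINDOW:], maxlen=WINDOW)
    predicted = []
    for _ in range(steps):
        frame = predict_next(list(window))
        predicted.append(frame)
        window.append(frame)  # the oldest frame drops out of the window automatically
    return predicted
```

In such a rollout every predicted frame replaces a measured one in the input window, so after fifteen steps the predictor sees only its own output, which is one plausible reason the reported error grows with the prediction horizon.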