Challenge
Autonomous AI systems, such as robots, smart home devices, or self-driving cars, offer support in a variety of situations. Through machine learning, these systems can increasingly operate independently. It is crucial that they adapt quickly, respond promptly, and function flawlessly. To achieve this, Reinforcement Learning (RL) is often used, in which correct behaviour is rewarded and errors are penalized. However, conventional RL algorithms are often costly, time-consuming, and data-intensive, requiring extensive feedback for accurate assessment and learning. Moreover, AI systems are usually trained in isolation, without active human involvement. When human feedback is obtained, it is often collected in a cumbersome manner through speech or gesture interaction, which makes the training feel unnatural and may require frequent interruptions.