Making AI More Intuitive Through Brain-Computer Interfaces

Interactive Reinforcement Learning for Autonomous AI Systems – Learning from Human Perceptual Responses via a Brain-Computer Interface.

Challenge

Autonomous AI systems, such as robots, smart home devices, or self-driving cars, offer support in a variety of situations. Through machine learning, these systems can operate increasingly independently, so it is crucial that they adapt quickly, respond promptly, and function flawlessly. To achieve this, Reinforcement Learning (RL) is often used, in which correct behaviour is rewarded and errors are penalized. However, conventional RL algorithms are often costly, time-consuming, and data-intensive, requiring extensive feedback for accurate assessment and learning. Moreover, AI systems are usually trained in isolation, without active human involvement. When human feedback is obtained at all, it is often collected in a cumbersome way through speech or gesture interaction, which makes the training feel unnatural and requires frequent interruptions.
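To make the sparse-reward problem concrete, here is a minimal sketch of tabular Q-learning in a hypothetical one-dimensional corridor environment, where the agent is rewarded only upon reaching the goal. All names and hyperparameters are illustrative and not taken from the project:

```python
# Minimal sketch: tabular Q-learning with a sparse reward signal.
# The corridor environment and all hyperparameters are hypothetical.
# The agent is rewarded only at the goal state, so early episodes
# produce no learning signal at all -- the core cost of sparse rewards.
import random

N_STATES, GOAL = 10, 9        # corridor of 10 states, goal at the right end
ACTIONS = (-1, +1)            # step left or step right
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(state):
    """Pick the highest-valued action, breaking ties randomly."""
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(300):
    s, done, t = 0, False, 0
    while not done and t < 200:              # cap episode length
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0   # sparse: zero almost everywhere
        done = s_next == GOAL
        # temporal-difference update toward the (mostly absent) reward
        best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s, t = s_next, t + 1
```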

Methodology

A promising approach is "learning from human feedback", in which a person gives direct feedback to the agent during the learning process. Our project aims to develop a novel interactive RL method that incorporates human feedback through a Brain-Computer Interface (BCI), based on implicitly measured brain signals such as the so-called error-related potential (ErrP). The ErrP occurs when people perceive something as incorrect or incongruent and can be measured using electroencephalography (EEG). This allows humans to recognize and assess errors made by the robot agent quickly, directly, and without additional effort.
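As a rough sketch of how such implicit feedback could enter the RL loop, the following Python fragment time-locks an EEG epoch to each agent action and converts a detected ErrP into a negative reward. The interfaces (env, agent, errp_probability) are hypothetical stand-ins, not the project's actual implementation:

```python
# Sketch of an ErrP-driven interactive RL step. All interfaces here
# (env, agent, errp_probability) are hypothetical stand-ins.
import numpy as np

ERRP_THRESHOLD = 0.5   # decision threshold of the ErrP classifier
ERRP_PENALTY = -1.0    # implicit feedback: the observer perceived an error

def errp_probability(eeg_epoch: np.ndarray) -> float:
    """Placeholder for a trained ErrP classifier (e.g. a CNN);
    here it simply returns a dummy probability."""
    return float(np.random.default_rng().random())

def interactive_step(env, agent, state):
    action = agent.select_action(state)
    next_state, sparse_reward, done = env.step(action)

    # Record an EEG epoch time-locked to the robot's action and check
    # whether the observing human implicitly flagged it as erroneous.
    epoch = env.record_eeg_epoch()
    implicit = ERRP_PENALTY if errp_probability(epoch) > ERRP_THRESHOLD else 0.0

    # The BCI-derived signal augments the environment's sparse reward,
    # so the agent learns from errors the human merely perceived.
    agent.update(state, action, sparse_reward + implicit, next_state, done)
    return next_state, done
```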

Results

In our initial study, we compared various EEG devices (gel-based and dry electrodes) and machine learning methods to assess how well they could differentiate between optimal and suboptimal robot behaviours, i.e. classify the measured ErrP in a person's EEG. We contrasted classical feature-engineering approaches with classifiers based on Riemannian geometry and a Convolutional Neural Network (CNN). The CNN-based model performed best regardless of electrode type (gel-based or dry), and the number of sensors did not play a significant role in classification accuracy.

In a second empirical feasibility study, we tested a proof-of-concept demonstrator in a physically realistic simulation environment, comparing the novel implicit BCI-driven RL training approach with explicit human feedback. Our findings indicate a significant acceleration and improvement in learning performance through the use of the BCI, compared to traditional methods based on sparse rewards (without human feedback). Furthermore, our results demonstrate that the performance of the BCI-based approach is even comparable to that achieved with conventional explicit feedback.

Our method highlights the added value of combining BCI and AI technologies for more intuitive and efficient robot training, which could not only reduce training times but also accelerate the development of adaptive and empathetic machines. This is particularly valuable in scenarios where explicit, cognitively demanding feedback is not available.
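To give a concrete impression of one of the non-CNN baselines from the first study, the sketch below builds a Riemannian-geometry classification pipeline with the open-source pyriemann and scikit-learn libraries. The data, channel count, and pipeline parameters are illustrative assumptions, not the study's actual configuration:

```python
# Sketch of a Riemannian-geometry ErrP classification baseline, using
# pyriemann and scikit-learn. The epochs X (n_epochs x n_channels x
# n_samples) and labels y are random dummy data; the study's actual
# preprocessing and electrode setup are not reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from pyriemann.estimation import XdawnCovariances
from pyriemann.tangentspace import TangentSpace

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8, 128))   # 200 epochs, 8 channels, 128 samples
y = rng.integers(0, 2, size=200)         # dummy labels: ErrP vs. no ErrP

# Xdawn spatial filtering -> covariance matrices -> tangent-space
# projection -> linear classifier.
clf = make_pipeline(
    XdawnCovariances(nfilter=4),
    TangentSpace(metric="riemann"),
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"mean AUC on dummy data: {scores.mean():.2f}")   # ~0.5 for pure noise
```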