Personalized deep-learning models can power artificial intelligence chatbots that adapt to a user’s accent, or smart keyboards that continuously update to better predict the next word. This personalization requires ongoing fine-tuning of machine-learning models with new data.
Because smartphones and other edge devices lack the memory and computational power for this fine-tuning, user data is typically uploaded to cloud servers, where the model is updated. But this approach has two drawbacks: transmitting sensitive user data to the cloud poses a security risk, and moving large amounts of data consumes a great deal of energy.
Researchers from MIT, the MIT-IBM Watson AI Lab, and other institutions have developed a technique that enables deep-learning models to adapt to new sensor data directly on an edge device. Their on-device training method, called PockEngine, determines which parts of a large machine-learning model need to be updated to improve accuracy, and stores and computes only those pieces. It performs the bulk of these computations while the model is being prepared, before runtime, which minimizes computational overhead and speeds up fine-tuning.
Compared to other methods, PockEngine significantly accelerated on-device training, running up to 15 times faster on some hardware platforms without any loss in model accuracy. The researchers also found that their fine-tuning technique enabled a popular AI chatbot to answer complex questions more accurately.
While on-device fine-tuning is difficult on resource-constrained hardware, it offers benefits like better privacy, lower costs, customization, and lifelong learning. Song Han, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) at MIT and a member of the MIT-IBM Watson AI Lab, emphasized the importance of handling both training and inference on edge devices. The research team, including lead author Ligeng Zhu and collaborators from MIT, the MIT-IBM Watson AI Lab, and the University of California San Diego, presented their findings at the IEEE/ACM International Symposium on Microarchitecture.
Deep-learning models consist of many interconnected layers of neurons that process data to make a prediction. During training and fine-tuning, a process called backpropagation updates each layer based on the gap between the model’s output and the correct answer. Rather than updating every layer, PockEngine fine-tunes individual layers on a given task and measures how much each one improves accuracy, identifying the portions of the model that are worth updating to best balance accuracy against tuning cost.
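The core idea of updating only selected layers can be illustrated with a toy sketch (this is not the PockEngine implementation, and the network here is invented for illustration): a two-layer network where backpropagation stops at the last layer, so the frozen pretrained layer costs nothing to update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network: x -> relu(x @ W1) -> h @ W2
W1 = rng.normal(size=(4, 8))   # frozen, pretrained layer
W2 = rng.normal(size=(8, 1))   # the only layer we fine-tune

def forward(x):
    h = np.maximum(x @ W1, 0.0)   # ReLU hidden activation
    return h, h @ W2

def finetune_step(x, y, lr=0.01):
    """One SGD step that updates only W2; no gradient is ever
    computed for W1, which is what makes the update cheap."""
    global W2
    h, pred = forward(x)
    err = (pred - y) / len(x)     # dLoss/dpred for 0.5 * mean squared error
    grad_W2 = h.T @ err           # backprop stops here
    W2 -= lr * grad_W2

x = rng.normal(size=(16, 4))
y = rng.normal(size=(16, 1))

W1_before = W1.copy()
_, pred0 = forward(x)
loss_before = 0.5 * np.mean((pred0 - y) ** 2)

for _ in range(50):
    finetune_step(x, y)

_, pred1 = forward(x)
loss_after = 0.5 * np.mean((pred1 - y) ** 2)
```

After these steps, `W1` is bit-for-bit unchanged while the loss has dropped, mirroring how a sparse update can adapt a model without paying for a full backward pass.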
By pruning unnecessary pieces of the model and preparing the computation ahead of deployment, PockEngine streamlines on-device training, reducing memory requirements and improving efficiency. In future work, the researchers aim to use PockEngine to fine-tune even larger models that can process text and images together.
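The idea of deciding ahead of time which layers to tune can be sketched as a simple selection problem (a hedged illustration, not PockEngine’s actual algorithm, and all layer names and numbers below are invented): given each layer’s measured accuracy gain and tuning cost, greedily pick the best layers under a cost budget, so the on-device runtime only ever touches that subset.

```python
# Hypothetical per-layer profiling results gathered offline:
# (layer name, accuracy gain from tuning it alone, relative tuning cost).
profile = [
    ("embed",  0.2, 5.0),
    ("block1", 0.5, 4.0),
    ("block2", 1.1, 4.0),
    ("block3", 1.8, 3.0),
    ("head",   2.4, 1.0),
]

def select_layers(profile, budget):
    """Greedily choose layers by gain-per-cost ratio until the
    tuning-cost budget is exhausted."""
    chosen, spent = [], 0.0
    ranked = sorted(profile, key=lambda t: t[1] / t[2], reverse=True)
    for name, gain, cost in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen

print(select_layers(profile, budget=5.0))  # → ['head', 'block3']
```

In this toy profile the later layers offer the most gain per unit cost, which matches the common observation that fine-tuning tends to matter most near a network’s output.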