The process of labelling or tagging data so that it can be used to train supervised machine learning models. Annotation provides the “ground truth” from which models learn the correct patterns and relationships for tasks like classification, object detection, or sentiment analysis.
Data annotation transforms raw data into training examples by adding labels that represent the correct outputs or interpretations. The quality, consistency, and representativeness of annotations directly affect model performance, because a model trained on noisy, inconsistent, or unrepresentative labels learns those flaws along with the intended patterns. Annotation may be performed by human annotators, automated tools, or a combination of both. For specialised domains like healthcare or finance, subject matter experts are often required to provide accurate annotations.
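One common way to check consistency when several human annotators label the same item is to compare their labels and keep the most frequent one. The following is a minimal sketch of that idea, not a prescribed workflow: the function names majority_label and agreement_share are hypothetical, and treating a tie as "needs review" is an assumption made for illustration.

```python
from collections import Counter


def majority_label(labels):
    """Return the most common label, or None when the top labels are tied."""
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels).most_common()
    # A tie between the two most frequent labels means there is no clear winner.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def agreement_share(labels):
    """Return the fraction of annotators who chose the most common label."""
    if not labels:
        raise ValueError("at least one label is required")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


# Hypothetical labels from three annotators for a single raw text.
votes = ["negative", "negative", "neutral"]
label = majority_label(votes)      # "negative"
share = agreement_share(votes)     # 2 of 3 annotators agree
```

Items where majority_label returns None, or where the agreement share is low, could be routed to a subject matter expert rather than used for training as they stand.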
A retail company annotating thousands of customer service interactions with labels indicating customer sentiment, issue type, and resolution outcome to train an AI system that can automatically categorise and prioritise incoming support requests.
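The retail example above can be sketched as a small data structure. This is an illustration under stated assumptions: the class name AnnotatedInteraction, the permitted label values, and the sample message are all hypothetical, and a real project would define its own label scheme.

```python
from dataclasses import dataclass

# Hypothetical label sets; a real annotation guideline would define these.
SENTIMENTS = frozenset({"positive", "neutral", "negative"})
ISSUE_TYPES = frozenset({"billing", "delivery", "product_defect", "other"})
RESOLUTIONS = frozenset({"resolved", "escalated", "unresolved"})


@dataclass(frozen=True)
class AnnotatedInteraction:
    """One customer service interaction together with its annotations."""

    text: str
    sentiment: str
    issue_type: str
    resolution: str

    def __post_init__(self):
        # Reject labels outside the agreed scheme so annotations stay consistent.
        checks = (
            ("sentiment", self.sentiment, SENTIMENTS),
            ("issue_type", self.issue_type, ISSUE_TYPES),
            ("resolution", self.resolution, RESOLUTIONS),
        )
        for field_name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(
                    f"{field_name}={value!r} is not one of {sorted(allowed)}"
                )

    def to_training_example(self):
        """Return (input text, target labels) as used for supervised training."""
        labels = {
            "sentiment": self.sentiment,
            "issue_type": self.issue_type,
            "resolution": self.resolution,
        }
        return self.text, labels


example = AnnotatedInteraction(
    text="My order arrived two weeks late and nobody replied to my emails.",
    sentiment="negative",
    issue_type="delivery",
    resolution="escalated",
)
features, labels = example.to_training_example()
```

Validating labels at creation time is one simple way to enforce a consistent scheme across annotators; the raw text becomes the model input and the label dictionary becomes the target.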