The present invention is a system and framework for augmenting any retail transaction system with information about the customers involved. It provides a method for associating the transaction data records of a customer or a group of customers with automatically extracted demographic features (e.g., gender, age, and ethnicity), shopping group information, and behavioral information, using computer vision algorithms. First, the system detects faces in the face view, tracks them individually, and estimates the pose of each tracked face so that the facial images can be normalized. The normalized facial images are processed by the demographics classification module to determine and record a demographics feature vector. The system also detects and tracks customers and analyzes their dynamic behavior so that their shopping group membership and checkout behavior can be recognized. The face instances and the body instances are then matched and combined. Finally, the transaction data and the demographics, group, and checkout behavior data that belong to the same person or the same group of people are combined.
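The final combining step can be illustrated with a minimal sketch. It is not the claimed method: it assumes that each transaction carries a checkout lane and a timestamp, that each tracked customer carries a checkout time interval, and that a simple time-window rule decides which tracks belong to a transaction. The names `Transaction`, `CustomerTrack`, `associate`, and the `tolerance` parameter are hypothetical, as are the keys used in the demographics dictionaries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    transaction_id: str
    lane: int          # checkout lane where the sale was recorded (assumed field)
    timestamp: float   # seconds since store opening (assumed unit)


@dataclass
class CustomerTrack:
    track_id: int
    lane: int
    checkout_start: float
    checkout_end: float
    demographics: dict              # e.g. {"gender": ..., "age_group": ...} (hypothetical keys)
    group_id: Optional[int] = None  # shopping group label, if one was recognized


def associate(transactions, tracks, tolerance=5.0):
    """Attach customer data to each transaction record.

    Assumed rule: a track matches a transaction when both are on the same
    lane and the transaction time falls inside the track's checkout
    interval, widened by `tolerance` seconds. Members of the same shopping
    group as a matched track are attached as well.
    """
    augmented = []
    for tx in transactions:
        matched = [
            t for t in tracks
            if t.lane == tx.lane
            and t.checkout_start - tolerance <= tx.timestamp <= t.checkout_end + tolerance
        ]
        matched_ids = {t.track_id for t in matched}
        group_ids = {t.group_id for t in matched if t.group_id is not None}
        # Keep directly matched tracks plus any other members of their groups.
        customers = [
            t for t in tracks
            if t.track_id in matched_ids or t.group_id in group_ids
        ]
        augmented.append({
            "transaction_id": tx.transaction_id,
            "customers": [
                {
                    "track_id": t.track_id,
                    "group_id": t.group_id,
                    "demographics": t.demographics,
                }
                for t in customers
            ],
        })
    return augmented
```

A transaction with no matching track keeps an empty customer list, so the original transaction records are never dropped. A deployed system would likely need a more robust matching rule than a fixed time window.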