Building a Virtual Try-On with Python, OpenCV, and Tkinter

Parneet Kaur Sandhu
6 min read · Nov 14, 2024


Ever wanted to try on different fashion accessories virtually, right from your device? This article walks you through creating a simple yet interactive virtual try-on using Python. Our app will allow users to “wear” accessories like glasses or hats, and we’ll use Tkinter to make it easy to navigate with buttons and a webcam feed for a live preview.

Virtual Try-On

Project Overview:

The goal of this project is to create a virtual try-on app using Python, integrating OpenCV for real-time video capture, Mediapipe for facial landmark detection, and Tkinter for a simple GUI interface. Users can select accessories, like glasses or hats, to overlay on their live webcam feed, with each accessory positioned accurately on their face. The project combines computer vision and GUI components, providing an interactive experience that allows users to see accessories in real-time, simulating a virtual fitting room experience.

How It Works:

To make this project interactive, I used OpenCV for video streaming, Mediapipe for detecting face and hand landmarks, and Tkinter to create a GUI interface. Here are the steps in detail:

Step 1: Initializing the Application

The application begins by importing the required libraries:

Technical Setup:

  • OpenCV: For video capture and image processing.
  • Mediapipe: For detecting and tracking face and hand landmarks.
  • NumPy: For array operations, particularly with image data.
  • Tkinter: For creating a graphical interface to control accessory selection, brightness adjustment, and interaction modes.

Installation

The app requires Python 3.x and the libraries mentioned above, which can be installed with:

pip install opencv-python mediapipe numpy

The video feed is obtained from the webcam, setting up a continuous loop to analyze each frame in real-time.

Step 2: Face and Hand Landmark Detection

Using Mediapipe’s FaceMesh and Hands modules, the application detects key landmarks on the face and hands:

  • FaceMesh identifies up to 468 landmarks on the face, allowing precise placement of accessories (e.g., glasses) around the eyes.
  • Hands detects landmarks on the user’s hands, tracking gestures. For example, the distance between thumb and index finger on the right hand is used to control brightness, while a thumbs-up gesture with the left hand triggers a screenshot.
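As a small sketch of how those landmarks get used: the inter-eye distance can be derived from two FaceMesh points. Indices 33 and 263 are commonly used for the outer eye corners, but treat them as an assumption here and verify against the FaceMesh landmark map. Mediapipe returns normalized coordinates, so they are scaled by the frame size:

```python
import math

# Commonly used FaceMesh indices for the outer eye corners (an assumption
# here; check the FaceMesh landmark map for your use case).
LEFT_EYE_OUTER, RIGHT_EYE_OUTER = 33, 263

def eye_distance_px(landmarks, frame_w, frame_h):
    """Pixel distance between the outer eye corners.

    `landmarks` is a sequence of points with normalized .x/.y fields,
    as produced by Mediapipe's FaceMesh."""
    lx = landmarks[LEFT_EYE_OUTER].x * frame_w
    ly = landmarks[LEFT_EYE_OUTER].y * frame_h
    rx = landmarks[RIGHT_EYE_OUTER].x * frame_w
    ry = landmarks[RIGHT_EYE_OUTER].y * frame_h
    return math.hypot(rx - lx, ry - ly)
```

This distance then drives both the width of the resized glasses image and, indirectly, its vertical placement.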

Step 3: Overlaying the Accessory Image

The selected accessory (e.g., a glasses or hat image) is stored in an assets folder. When the user selects an accessory file through the Tkinter GUI, OpenCV resizes it dynamically to fit the user’s face.

This process involves:

  • Calculating Scale and Position: The program calculates the distance between the eyes to resize and position the accessory accurately. This is done by referencing specific landmarks on the face detected by Mediapipe.
  • Overlaying with Transparency: Using OpenCV and NumPy, the accessory image is placed onto the video feed with transparency managed by the image’s alpha channel.
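A minimal version of that blending step, assuming the accessory is a BGRA image (loaded with `cv2.imread(path, cv2.IMREAD_UNCHANGED)` so the fourth channel carries transparency), could look like:

```python
import numpy as np

def overlay_accessory(frame, accessory, x, y):
    """Alpha-blend a BGRA accessory onto a BGR frame at top-left (x, y).

    Assumes the accessory fits entirely inside the frame; production code
    would clip the region at the frame borders."""
    h, w = accessory.shape[:2]
    roi = frame[y:y + h, x:x + w]                       # region under the accessory
    alpha = accessory[:, :, 3:4].astype(float) / 255.0  # per-pixel opacity in [0, 1]
    blended = alpha * accessory[:, :, :3] + (1 - alpha) * roi
    frame[y:y + h, x:x + w] = blended.astype(np.uint8)
    return frame
```

Fully transparent pixels (alpha 0) leave the video feed untouched, which is what lets a PNG of glasses sit cleanly over the face without a rectangular border.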

Step 4: Real-Time Brightness Adjustment

Brightness adjustment is controlled by the right hand. When the program detects the right hand, it calculates the distance between the thumb and index finger. This distance is mapped to a brightness level:

  • Shorter Distance: Decreases brightness.
  • Longer Distance: Increases brightness.

This adjustment is achieved by applying a scaling factor to the pixel values of the video feed.
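Sketching that mapping and scaling (the pixel-distance thresholds and the 0.5–1.5 brightness range are illustrative choices, not the app’s exact values):

```python
import numpy as np

def map_distance_to_brightness(dist_px, lo=20.0, hi=200.0):
    """Map a thumb-index distance in pixels to a brightness factor in [0.5, 1.5]."""
    t = min(max((dist_px - lo) / (hi - lo), 0.0), 1.0)  # clamp to [0, 1]
    return 0.5 + t  # a short pinch dims, a wide pinch brightens

def apply_brightness(frame, factor):
    """Scale pixel values by the factor, clipping to the valid 0-255 range."""
    return np.clip(frame.astype(float) * factor, 0, 255).astype(np.uint8)
```

Clipping matters here: without it, bright pixels multiplied past 255 would wrap around when cast back to `uint8` and show up as dark artifacts.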

Step 5: Capturing Screenshots with Hand Gestures

The left hand is monitored for a thumbs-up gesture:

  1. Gesture Detection: Mediapipe identifies when the left hand’s thumb is extended and upright while other fingers are down, signaling a thumbs-up.
  2. Screenshot Capture: When detected, the current frame is saved as an image file, timestamped for uniqueness, in the screenshots directory.

Step 6: User Interface with Tkinter

The Tkinter GUI allows users to control the experience without relying solely on gestures:

  • Accessory Selection: A button opens a file dialog, allowing users to select accessory images dynamically.
  • Brightness Toggle: A checkbox allows enabling/disabling brightness control, giving users the option to switch to gesture-free interaction.
  • Status Updates: Tkinter labels display updates (e.g., “Screenshot Captured”) to inform the user of ongoing actions, enhancing usability.

Final Step: Running and Exiting the Application

Simply launch the app with:

python main.py

The app opens the webcam, applies the virtual accessory, and activates gesture controls for brightness and screenshots. The following controls make the app interactive:

  • Right Hand: Adjusts brightness via the distance between the thumb and index finger.
  • Left Hand: Captures a screenshot when a thumbs-up gesture is detected.

Press q to exit the app.

Key Features:

  1. Virtual Accessory Placement: The app detects face landmarks (specifically around the eyes) using Mediapipe, then aligns the virtual accessory (like glasses) in real-time on the user’s face.
  2. Brightness Control: By moving your right hand closer or farther, the app dynamically adjusts the brightness of the video feed based on the distance between your thumb and index finger.
  3. Screenshot Capture: Capturing a photo is as simple as giving a thumbs-up with your left hand! The app detects the gesture, and your photo is saved with a timestamp.
  4. Customizable Accessories: Users can load their accessories (e.g., PNG images with transparency) into an assets folder, making it easy to experiment with different styles.

Here comes an interesting question: why Tkinter?

Tkinter provides an accessible way to manage user interactions in Python, and it complements OpenCV well in applications needing minimal user interfaces. Here, Tkinter serves as a panel that allows users to:

  • Select Accessories: Through a simple button and file picker interface, users can browse and choose accessories (e.g., images of glasses or hats in .png format).
  • Preview Controls: The panel makes it easy to enable/disable features like brightness control and hand gestures, making it accessible to new users without needing keyboard shortcuts.

To integrate Tkinter into the Virtual Try-On, I used it to create a simple, user-friendly GUI. Tkinter is Python’s standard library for building graphical interfaces, allowing quick development of windows, buttons, labels, and other elements that make an app interactive and visually accessible.

Key Components in Tkinter

The core of the Tkinter implementation involves the following:

  1. Main Window: The root window holds the application’s control panel.
  2. Accessory Selection Button: A button triggers a file dialog, letting users load accessories dynamically from their computer.
  3. Brightness Control Toggle: A checkbox or slider provides options to enable/disable brightness adjustments, directly connected to the OpenCV processing function.
  4. Status Updates: Labels that update the user on the status of screenshot captures or accessory placements, making interactions intuitive and error-free.
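The components above can be sketched as a stripped-down panel. Widget names and callbacks here are illustrative, and `state` is a plain dict that the OpenCV loop would read each frame for the current accessory path and brightness setting:

```python
import tkinter as tk
from tkinter import filedialog

def status_text(event):
    """Format a status-label message for the given event name."""
    return f"Status: {event}"

def build_panel(root, state):
    """Assemble the control panel; `state` is shared with the OpenCV loop."""
    status = tk.Label(root, text=status_text("Ready"))

    def pick_accessory():
        path = filedialog.askopenfilename(filetypes=[("PNG images", "*.png")])
        if path:
            state["accessory_path"] = path
            status.config(text=status_text("Accessory loaded"))

    tk.Button(root, text="Select Accessory", command=pick_accessory).pack()
    enabled = tk.BooleanVar(value=True)
    tk.Checkbutton(root, text="Brightness control", variable=enabled,
                   command=lambda: state.update(
                       brightness_enabled=enabled.get())).pack()
    status.pack()
    return status

if __name__ == "__main__":
    root = tk.Tk()
    root.title("Virtual Try-On Controls")
    build_panel(root, {"accessory_path": None, "brightness_enabled": True})
    root.mainloop()
```

Keeping the shared state in a plain dict (or a small class) is the simplest way to let the Tkinter callbacks and the video loop communicate without threading machinery.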

Connecting Tkinter with OpenCV and Mediapipe

Tkinter controls are linked to the functions for placing accessories, adjusting brightness, and taking screenshots. For example:

  • The accessory selection button triggers an event that loads an image file, which is then processed by OpenCV to be overlaid on the user’s face.
  • The brightness toggle in Tkinter directly modifies a variable used in the Mediapipe loop for real-time brightness control.

Tkinter’s lightweight nature makes it ideal for applications like this one, where simplicity and responsiveness are key. If you want to extend the app’s capabilities, Tkinter can also support additional widgets, such as drop-down menus for selecting multiple accessories or sliders for more precise brightness adjustments.

Demo Snapshot:

Working Demo

Conclusion:

The app combines computer vision techniques with gesture control to provide an interactive virtual try-on experience. By using OpenCV, Mediapipe, and Tkinter, it enables real-time facial and hand tracking with a simple, accessible GUI for a fun, engaging user experience. The modular setup makes it easy to expand the application further, adding more accessories, customization options, or additional gestures as desired.

This journey was rewarding and a perfect mix of coding and creativity. If you want to give it a shot, check out my GitHub repository, Virtual-Try-On, for all the code and setup instructions.
