Vision Project

Overview

The Vision Project is a comprehensive computer vision application designed to assist visually impaired individuals by leveraging advanced technologies for object detection, optical character recognition (OCR), and face recognition. Idea by secondary stage students Janaabdelfatah and Romaysa, this project was part of their participation in the Genius Olympiad.

Video Demonstration

Version Working on Raspberry Pi

Check out this video demonstration of the Vision Project in action:

Version Working on Laptop

Features

Version 1: Website (Using Laptop)

Technologies Used:
- Flask framework for backend.
- HTML, CSS, and JavaScript for frontend.
- Python for core functionality.
Functionalities:
- Object Detection: Uses YOLOv5 model from Hugging Face.
- Face Recognition: Uses Haar Cascade Classifier.
- Optical Character Recognition (OCR): Uses Tesseract OCR.
- Text-to-Speech (TTS): Converts detected text to speech using Google TTS.

Version 2: Raspberry Pi

Technologies Used:
- Python for core functionality.
- Raspberry Pi 4 with 8GB RAM and Raspberry Pi Camera.
Functionalities:
- Object Detection: Uses YOLOv5 model.
- Face Detection: Due to computational power limitations, uses Haar Cascade Classifier.
- Optical Character Recognition (OCR): Uses Tesseract OCR.

Hardware Details

Raspberry Pi 4 Model B (8 GB)

Description:
- Raspberry Pi 4 Model B, Wi-Fi, 2x micro HDMI, USB-C, USB 3.0, 8 GB of RAM 1.5 GHz.
- The latest product in the Raspberry Pi range, offering improvements in processor speed, multimedia performance, memory, and connectivity.
Main Features:
- 64-bit quad-core processor.
- Dual display support with resolutions up to 4K.
- 8GB LPDDR4-2400 SDRAM.
- Dual-band 2.4/5.0 GHz wireless LAN, Bluetooth 5.0, Gigabit Ethernet.
- USB 3.0 and PoE capabilities (via a separate PoE HAT add-on).

Raspberry Pi Camera Board v1.3

Description:
- Plugs directly into the CSI connector on the Raspberry Pi.
- Delivers a 5MP resolution image or 1080p HD video recording at 30fps.

Installation and Setup

Version 1: Website (Using Laptop)

Clone the repository:

git clone https://github.com/Geo-y20/Vision-Project.git
cd Vision-Project

Install the required dependencies:
```
pip install -r vision.txt
```
Run the Flask application:
```
flask run
```

Version 2: Raspberry Pi

Clone the repository:

git clone https://github.com/Geo-y20/Vision-Project.git
cd Vision-Project

Ensure the Raspberry Pi environment is correctly set up with all necessary packages installed.
Run the scripts:
- camera.py: Check for camera functionality.
```
python camera.py
```
- facedetection.py: Perform face detection using the Haar Cascade Classifier.
```
python facedetection.py
```
- obj.py: Perform object detection using YOLOv5.
```
python obj.py
```
- ocr.py: Perform OCR using Tesseract.
```
python ocr.py
```

Scripts Explanation

camera.py:
- Checks if the Raspberry Pi camera is correctly set up and functional.
- Ensures the camera can capture images and video.
facedetection.py:
- Uses the Haar Cascade Classifier to detect faces in real-time.
- Captures video from the camera and applies the face detection algorithm.
obj.py:
- Uses YOLOv5 for real-time object detection.
- Captures video from the camera, processes it through the YOLOv5 model, and identifies objects.
ocr.py:
- Uses Tesseract to perform OCR on images captured by the camera.
- Converts the recognized text to speech using Google TTS.

Object Detection: YOLOv5

YOLOv5 (You Only Look Once) is used for real-time object detection. For more details on YOLOv5, visit the Roboflow blog and the COCO dataset.

Precision and Recall Equations

Precision: TP/TP+FP
- TP: True Positives
- FP: False Positives
Recall: TP/TP+FN
- TP: True Positives
- FN: False Negatives

Object Detection Performance with YOLOv5

Object	Precision (%)	Recall (%)	Processing Time (ms)
Person	98	97	20
Car	96	95	22
Bicycle	95	93	25
Dog	94	92	23
Cat	93	91	24

OCR: Tesseract

The Tesseract library is used for optical character recognition. For more information, refer to the Tesseract guide.

OCR Performance

Document Type	Precision (%)	Recall (%)	Processing Time (ms)
Invoice	95	94	150
Letter	93	92	140
Receipt	94	91	145
Book Page	92	90	155
ID Card	90	88	160

Face Recognition: Haar Cascade

The Haar Cascade Classifier is used for face detection and recognition. This method involves training a classifier using positive and negative samples and applying it to detect faces in images.

Face Recognition Performance

Person	Precision (%)	Recall (%)	Processing Time (ms)
Jana	98	97	100
Romaysa	97	96	105
Mariam	96	95	110
Mohamed	95	94	115
Youssef	94	93	120

Methodology

The Vision Project follows a systematic approach to ensure the highest performance and reliability:

Requirements Analysis:
- Understanding the needs of visually impaired users.
- Defining functional and non-functional requirements.
System Design:
- Creating a blueprint of the overall architecture.
- Using Flask framework for backend and HTML, CSS, JavaScript for frontend in the laptop version.
- Using Python for core functionality in the Raspberry Pi version.
Model Selection and Integration:
- Object Detection: YOLOv5
- OCR: Tesseract
- Face Recognition: Haar Cascade Classifier
Implementation:
- Developing the web application for the laptop version.
- Integrating the models for object detection, OCR, and face recognition.
Testing:
- Unit Testing: Testing individual components.
- Integration Testing: Ensuring all components work together.
- Performance Testing: Measuring response times and accuracy.
- User Testing: Gathering feedback from visually impaired users.
Evaluation:
- Analyzing performance metrics.
- Visualizing results using graphs and charts.

Additional Graphs and Charts

Confusion Matrix: For each task (Object Detection, OCR, Face Recognition), a confusion matrix shows the performance in terms of true positives, false positives, false negatives, and true negatives.
Precision-Recall Curve: Shows the trade-off between precision and recall for different threshold settings.
Receiver Operating Characteristic (ROC) Curve: Plots the true positive rate against the false positive rate for binary classification tasks.
F1 Score: Combines precision and recall into a single metric using the harmonic mean.
Accuracy Over Different Conditions: Compares accuracy under various conditions such as different lighting or image quality levels.

Contributors

This project was collaboratively developed by the following contributors:

George Youhana - georgeyouhana2@gmail.com
Mostafa Magdy - Mustafa.10770@stemredsea.moe.edu.eg
Abdallah Alkhouly - a.alkholy53@student.aast.edu
Mohamed Hany Sallam - m.h.sallam1@student.aast.edu

Janaabdelfatah and Romaysa, two girls in the secondary stage, competed in the Genius Olympiad with this project.

Access the Project

You can access the project files here: raspberry pi.rar

Contact

For any inquiries or further information, please contact the contributors via their provided email addresses.

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
Docuemnation of All Project		Docuemnation of All Project
samples for testing ocr		samples for testing ocr
static		static
templates		templates
Final-obj-ocr-face-project.mp4		Final-obj-ocr-face-project.mp4
First Trial.mp4		First Trial.mp4
README.md		README.md
capture.JPG		capture.JPG
collect_data.py		collect_data.py
dlib-19.24.1-cp311-cp311-win_amd64.whl		dlib-19.24.1-cp311-cp311-win_amd64.whl
face.py		face.py
raspberry pi.rar		raspberry pi.rar
test.py		test.py
thumbnail.JPG		thumbnail.JPG
vision.txt		vision.txt
yarb2.py		yarb2.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Vision Project

Overview

Video Demonstration

Version Working on Raspberry Pi

Version Working on Laptop

Features

Version 1: Website (Using Laptop)

Version 2: Raspberry Pi

Hardware Details

Raspberry Pi 4 Model B (8 GB)

Raspberry Pi Camera Board v1.3

Installation and Setup

Version 1: Website (Using Laptop)

Version 2: Raspberry Pi

Scripts Explanation

Object Detection: YOLOv5

Precision and Recall Equations

Object Detection Performance with YOLOv5

OCR: Tesseract

OCR Performance

Face Recognition: Haar Cascade

Face Recognition Performance

Methodology

Additional Graphs and Charts

Contributors

Access the Project

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Vision Project

Overview

Video Demonstration

Version Working on Raspberry Pi

Version Working on Laptop

Features

Version 1: Website (Using Laptop)

Version 2: Raspberry Pi

Hardware Details

Raspberry Pi 4 Model B (8 GB)

Raspberry Pi Camera Board v1.3

Installation and Setup

Version 1: Website (Using Laptop)

Version 2: Raspberry Pi

Scripts Explanation

Object Detection: YOLOv5

Precision and Recall Equations

Object Detection Performance with YOLOv5

OCR: Tesseract

OCR Performance

Face Recognition: Haar Cascade

Face Recognition Performance

Methodology

Additional Graphs and Charts

Contributors

Access the Project

Contact

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages