#OCR #Computer Vision

INEC OCR

A proof-of-concept application that extracts and verifies results from INEC statement of result forms to help mitigate electoral fraud.

Introduction

INEC-OCR is a system that uses Optical Character Recognition (OCR) to extract and verify results from INEC statement of result forms, with the goal of detecting and preventing electoral fraud.

Key Features

  • Automated Data Extraction: Designed and implemented an OCR pipeline to extract structured data (e.g., candidate names, vote counts, and polling unit details) from scanned images of INEC result forms.
  • Data Validation: Integrated cross-verification mechanisms to detect discrepancies by comparing extracted data with official results from INEC.
  • Fraud Detection (work in progress): Experimenting with algorithms that flag tampered forms by analyzing inconsistencies in text and handwriting patterns.
  • User-Friendly Interface: Developed an intuitive web page for running the extraction, visualizing the extracted results, and generating downloadable reports.
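
The cross-verification step described under Data Validation can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Discrepancy` class, the `find_discrepancies` function, and the party labels are all hypothetical, and it assumes both the extracted and the official results are available as simple party-to-vote-count mappings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Discrepancy:
    """One mismatch between the OCR output and the official record (hypothetical type)."""
    party: str
    extracted: Optional[int]  # None if the party is missing from the OCR output
    official: Optional[int]   # None if the party is missing from the official record


def find_discrepancies(
    extracted: Dict[str, int], official: Dict[str, int]
) -> List[Discrepancy]:
    """Return one Discrepancy per party whose counts differ or are missing on either side."""
    issues = []
    # Walk the union of party names so entries missing on one side are also reported.
    for party in sorted(extracted.keys() | official.keys()):
        ocr_count = extracted.get(party)
        official_count = official.get(party)
        if ocr_count != official_count:
            issues.append(Discrepancy(party, ocr_count, official_count))
    return issues


# Illustrative data only; these are not real results.
extracted_votes = {"PARTY_A": 120, "PARTY_B": 85, "PARTY_C": 10}
official_votes = {"PARTY_A": 120, "PARTY_B": 95}

for issue in find_discrepancies(extracted_votes, official_votes):
    print(issue)
```

Reporting parties that are missing on either side, rather than only numeric mismatches, keeps an OCR miss from silently looking like agreement.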

Technologies Used

  • Python
  • FastAPI
  • Machine Learning: agglomerative clustering
  • OCR Engine: PaddleOCR
  • Docker
  • Kubernetes (k8s)
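
The list above does not say what the agglomerative clustering is applied to. One plausible use, assumed here purely for illustration, is grouping the text boxes returned by the OCR engine into table rows based on their vertical positions. The sketch below implements single-linkage agglomerative clustering on 1-D coordinates in plain Python; the function name `cluster_rows` and the `max_gap` threshold are hypothetical. In practice a library implementation such as scikit-learn's `AgglomerativeClustering` (with `n_clusters=None` and a `distance_threshold`) would likely be used instead.

```python
from typing import List


def cluster_rows(y_centers: List[float], max_gap: float) -> List[List[int]]:
    """Single-linkage agglomerative clustering of 1-D points.

    Every point starts in its own cluster; the two closest clusters are merged
    repeatedly until the smallest remaining gap exceeds max_gap. Returns the
    clusters as lists of indices into y_centers, ordered by coordinate.
    """
    # Sort indices by coordinate: in 1-D, the closest pair of clusters is always adjacent.
    order = sorted(range(len(y_centers)), key=lambda i: y_centers[i])
    clusters = [[i] for i in order]

    while len(clusters) > 1:
        # Single-linkage distance between neighbours: the gap between the last
        # point of one cluster and the first point of the next.
        gaps = [
            y_centers[clusters[k + 1][0]] - y_centers[clusters[k][-1]]
            for k in range(len(clusters) - 1)
        ]
        k = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[k] > max_gap:
            break  # no remaining pair is close enough to merge
        clusters[k:k + 2] = [clusters[k] + clusters[k + 1]]

    return clusters


# Hypothetical vertical centres (in pixels) of five detected text boxes.
box_y_centers = [10.0, 52.0, 12.5, 50.0, 90.0]
rows = cluster_rows(box_y_centers, max_gap=5.0)
print(rows)
```

Each inner list holds the indices of boxes judged to lie on the same row, so the corresponding text fragments can be joined left to right to rebuild a table line.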

Potential Use-Cases

  • Enhancing transparency and accountability in electoral processes
  • Minimizing the risk of manual errors and manipulation in INEC statement of result forms