#Google Images #Web scraping #CLI

Scrapix CLI

A command-line interface for scraping Google Images to build small, customizable computer vision datasets

Introduction

Creating datasets for computer vision tasks often requires time-consuming image collection from multiple sources.

Scrapix CLI simplifies the process by scraping Google Images for specified keywords, so developers and researchers can quickly gather and organize small computer vision datasets tailored to their project’s needs.

Key Features

  1. Keyword-Based Scraping: Gathers images directly from Google Images based on specified keywords and search terms.
  2. Structured Keyword Input: Accepts JSON/YAML files with keywords and image counts for precise dataset customization.
  3. Automatic Organization: Saves images into folders named after their keywords for effortless dataset preparation.
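
To make the structured input and folder organization more concrete, here is a minimal TypeScript sketch. It is not Scrapix's actual code: the file schema (an array of objects with `keyword` and `count` fields), the function names `parseKeywordFile` and `folderNameFor`, and the folder-naming rule are all assumptions made for illustration.

```typescript
// Hypothetical shape of one entry in a keyword file.
// The real Scrapix schema may differ.
interface KeywordSpec {
  keyword: string; // search term to send to Google Images
  count: number;   // how many images to collect for that term
}

// Parse the JSON text of a keyword file and validate every entry.
function parseKeywordFile(jsonText: string): KeywordSpec[] {
  const data: unknown = JSON.parse(jsonText);
  if (!Array.isArray(data)) {
    throw new Error("Keyword file must contain a JSON array");
  }
  return data.map((entry, index) => {
    const { keyword, count } = (entry ?? {}) as Partial<KeywordSpec>;
    if (typeof keyword !== "string" || keyword.trim() === "") {
      throw new Error(`Entry ${index}: "keyword" must be a non-empty string`);
    }
    if (typeof count !== "number" || !Number.isInteger(count) || count <= 0) {
      throw new Error(`Entry ${index}: "count" must be a positive integer`);
    }
    return { keyword: keyword.trim(), count };
  });
}

// Turn a keyword into a folder-safe name (assumed naming rule):
// lowercase, runs of non-alphanumeric characters become "_",
// and leading/trailing underscores are removed.
function folderNameFor(keyword: string): string {
  return keyword
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Usage: one output folder per keyword.
const exampleSpecs = parseKeywordFile(
  '[{"keyword": "Golden Retriever", "count": 25}]'
);
for (const { keyword, count } of exampleSpecs) {
  console.log(`${count} images -> ${folderNameFor(keyword)}/`);
}
// prints "25 images -> golden_retriever/"
```

Validating the file up front means a malformed entry is reported before any scraping starts, rather than partway through a run.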

Technologies Used

  • Node.js
  • TypeScript
  • Inquirer - an npm package for building interactive command-line user interfaces

Potential Use-Cases

  • Collecting images tailored to specific research needs or domains.
  • Building small datasets for initial experiments in computer vision tasks.