FROGS: Fine Resolution Optimization and Gradient Smoothing

December 16, 2023 · 4 min read
An image generated using the prompt: "An astronaut frog in space"

Overview

Welcome to the FROGS project, an approach to generating high-resolution images on memory-constrained devices. The project chains pre-trained Stable Diffusion models with super-resolution deep neural networks (DNNs) into a pipeline that generates high-quality images using minimal VRAM. Combined with image post-processing, FROGS aims to generate high-resolution backgrounds on devices with less than 8 GB of VRAM.

Project Description

The motivation behind FROGS is to address the challenge of generating high-resolution images on devices with limited memory, without compromising on image quality. Existing solutions either require substantial memory or involve manual adjustments through external applications. FROGS offers a streamlined, on-device solution by integrating advanced deep learning models and optimization techniques.

Key Features

  • Pre-trained Models: Utilizes models such as Lykon’s Dreamshaper V7, the Latent Consistency LoRA model, and a Latent Diffusion Model (LDM) for super-resolution.
  • Memory Optimization: Capable of generating high-resolution images with a fraction of the VRAM typically required.
  • Image Quality: Combines super-resolution and gradient smoothing to enhance image quality and reduce artifacts.

Methodology

Dataset

The project utilizes pre-trained models available on Hugging Face.

Image Generation Process

  1. Pre-trained Diffusion Model: Generates the initial low-resolution image.
  2. Super-Resolution DNN Models: Enhance the resolution of the generated image.
  3. Post-Processing Techniques: Apply gradient smoothing and alpha blending to improve image quality and remove seams.
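
The first two stages could be wired together roughly as follows with the Hugging Face diffusers library. This is a sketch under stated assumptions, not the FROGS implementation: the Hub IDs, step counts, and function names are guesses chosen to match the models named under Key Features. Running it requires torch and diffusers plus a CUDA device, and it downloads the weights on first use. It also upscales the whole image in one pass, so the quadrant handling and blending of stage 3 are left out.

```python
# Hub IDs are assumptions inferred from the model names in this post.
BASE_MODEL = "Lykon/dreamshaper-7"
LCM_LORA = "latent-consistency/lcm-lora-sdv1-5"
SR_MODEL = "CompVis/ldm-super-resolution-4x-openimages"


def load_generator(device="cuda"):
    """Stage 1: base diffusion model plus an LCM-LoRA for few-step sampling."""
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(LCM_LORA)
    return pipe.to(device)


def load_upscaler(device="cuda"):
    """Stage 2: latent-diffusion super-resolution model (4x upscaling)."""
    from diffusers import LDMSuperResolutionPipeline

    return LDMSuperResolutionPipeline.from_pretrained(SR_MODEL).to(device)


def generate_and_upscale(prompt, device="cuda"):
    """Run stage 1, release its memory, then run stage 2 on the result."""
    import torch

    generator = load_generator(device)
    # LCM sampling needs only a few steps; guidance_scale=1.0 turns off
    # classifier-free guidance.
    low_res = generator(prompt, num_inference_steps=4, guidance_scale=1.0).images[0]

    # Drop the generator before loading the upscaler so that both models
    # are never resident in VRAM at the same time.
    del generator
    if device == "cuda":
        torch.cuda.empty_cache()

    upscaler = load_upscaler(device)
    return upscaler(low_res, num_inference_steps=50, eta=0.0).images[0]
```

A call such as generate_and_upscale("An astronaut frog in space") would return a PIL image. Loading the two models one after the other, instead of keeping both resident, is one straightforward way to keep peak memory close to that of the larger stage.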

Improvements

  • Memory Profile Optimization: Reduces the memory required for image generation, making it feasible on mobile devices.
  • Artifact Reduction: Enhances image quality by minimizing artifacts typically present in high-resolution images.
  • Parallel Processing: Generates images in parallel, improving efficiency and speed.
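
The post does not say how the parallelism is implemented, so the following is only an illustrative sketch. It splits an image into four overlapping quadrants and processes them concurrently with a thread pool. The function names are hypothetical, images are plain nested lists to keep the example dependency-free, and a nearest-neighbour upscaler stands in for the super-resolution model.

```python
from concurrent.futures import ThreadPoolExecutor


def split_quadrants(image, overlap):
    """Split a 2D image into four quadrants, each reaching `overlap` pixels past the midlines."""
    height, width = len(image), len(image[0])
    mid_y, mid_x = height // 2, width // 2
    row_spans = [(0, mid_y + overlap), (mid_y - overlap, height)]
    col_spans = [(0, mid_x + overlap), (mid_x - overlap, width)]
    # Order: top-left, top-right, bottom-left, bottom-right.
    return [
        [row[x0:x1] for row in image[y0:y1]]
        for (y0, y1) in row_spans
        for (x0, x1) in col_spans
    ]


def upscale_nearest(tile, factor=2):
    """Stand-in for the super-resolution model: repeat every pixel `factor` times in both axes."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in tile
        for _ in range(factor)
    ]


def upscale_tiles_parallel(tiles, workers=4):
    """Upscale all tiles concurrently; `map` returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upscale_nearest, tiles))
```

Note the trade-off: running several tiles through a model at the same time on one GPU raises peak memory, so a memory-constrained setup may prefer fewer workers.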

Seam Removal Process

Because the final image is assembled from image quadrants, seams may appear where the quadrants are merged. To address this, we apply alpha blending to the overlapping regions so that each quadrant fades gradually into its neighbor.
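
As a minimal illustration of alpha blending across an overlap, each blended pixel is (1 - alpha) * left + alpha * right, with alpha rising across the overlap. The sketch below joins two tiles side by side. The linear ramp, the function name, and the nested-list image format are assumptions for the example; the blend profile FROGS actually uses is not specified here.

```python
def blend_horizontal(left, right, overlap):
    """Join two equal-height tiles whose last/first `overlap` columns cover the same region."""
    if overlap < 1:
        raise ValueError("overlap must be at least 1 pixel")
    if len(left) != len(right):
        raise ValueError("tiles must have the same height")

    # Weight of the right tile rises linearly from left to right across the overlap.
    alphas = [(i + 1) / (overlap + 1) for i in range(overlap)]

    merged = []
    for left_row, right_row in zip(left, right):
        seam = [
            (1 - a) * l + a * r
            for a, l, r in zip(alphas, left_row[-overlap:], right_row[:overlap])
        ]
        # Non-overlapping left part, blended strip, non-overlapping right part.
        merged.append(left_row[:-overlap] + seam + right_row[overlap:])
    return merged


row = blend_horizontal([[10.0] * 4], [[20.0] * 4], overlap=3)[0]
print(row)  # [10.0, 12.5, 15.0, 17.5, 20.0]
```

The hard step from 10 to 20 becomes a gradual ramp. A horizontal seam between upper and lower quadrants can be handled the same way by blending rows instead of columns.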

Example Images

  • Prompt: person smiling at the camera

  • Image with Visible Seams and Overlapping Edges

  • Seam-Removed Image

Performance Metrics

  • Memory Usage:
    • Image generation: 1.5 GB
    • Super-resolution: 7.7 GB (can be reduced to ~2.0 GB with CPU offloading)
  • Time:
    • Image generation: ~7 seconds
    • Super-resolution: ~15 seconds
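
To relate these figures to the 8 GB target, the short calculation below assumes the two stages run one after the other and that the first stage's memory is released before the second starts. That assumption is not stated in the post. Under it, peak VRAM is the larger of the two stage figures, not their sum, and the stage times add up. The timings are taken as listed; whether CPU offloading changes the super-resolution time is not reported.

```python
# Figures copied from the list above (GB and seconds).
GENERATION_GB = 1.5
SUPER_RES_GB = 7.7
SUPER_RES_OFFLOAD_GB = 2.0
GENERATION_S = 7
SUPER_RES_S = 15
BUDGET_GB = 8.0


def peak_vram(*stage_gb):
    """Peak usage when stages run sequentially and each frees its memory afterwards."""
    return max(stage_gb)


default_peak = peak_vram(GENERATION_GB, SUPER_RES_GB)
offload_peak = peak_vram(GENERATION_GB, SUPER_RES_OFFLOAD_GB)
total_seconds = GENERATION_S + SUPER_RES_S

print(default_peak, default_peak < BUDGET_GB)  # 7.7 True
print(offload_peak)                            # 2.0
print(total_seconds)                           # 22
```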

Results

FROGS successfully combines Stable Diffusion and super-resolution models into a single pipeline capable of running natively on memory-limited systems. The project demonstrates that high-resolution images can be generated without sacrificing quality, even on devices with constrained resources.

Examples (See Slides Below)

  • Celestial Clockwork: A depiction of a celestial clockwork with planets and stars moving in harmony.
  • Japanese Cherry Blossoms: A serene Japanese garden in spring, with cherry blossoms in full bloom.
  • Winter Wonderland: A magical snowy landscape with a cozy cabin and a penguin.
  • Space Odyssey: A stunning view of a distant galaxy from the window of a spaceship, with planets and stars.
  • Aurora Over Mountains: Northern Lights dancing over snow-capped mountains under a starry sky.
  • Redwood Forest National Park: Photorealistic depiction of a redwood forest.

Future Work

  • Porting to macOS/iOS: Adapt the program for Apple’s ecosystem.
  • User Interface: Develop a user-friendly interface for easier interaction.
  • Executable Compilation: Compile the program into a single executable for easy deployment.

References

For a detailed list of references, please refer to the complete research paper.

Paper
