FROGS: Fine Resolution Optimization and Gradient Smoothing

Overview

FROGS is a pipeline for generating high-resolution images on memory-constrained devices. It chains a pre-trained Stable Diffusion model with a super-resolution deep neural network (DNN) and then applies image post-processing to clean up the result. The goal is to generate high-resolution backgrounds on devices with less than 8 GB of VRAM.

Project Description

The motivation behind FROGS is to address the challenge of generating high-resolution images on devices with limited memory, without compromising on image quality. Existing solutions either require substantial memory or involve manual adjustments through external applications. FROGS offers a streamlined, on-device solution by integrating advanced deep learning models and optimization techniques.

Key Features

Pre-trained Models: Uses Lykon’s Dreamshaper V7, the Latent Consistency Model (LCM) LoRA, and a Latent Diffusion Model (LDM) for super-resolution.
Memory Optimization: Generates high-resolution images in about 1.5 GB of VRAM for image generation and about 2.0 GB for super-resolution with CPU offloading (see Performance Metrics).
Image Quality: Combines super-resolution and gradient smoothing to enhance image quality and reduce artifacts.

Methodology

Models

The project utilizes pre-trained models available on Hugging Face:

Lykon’s Dreamshaper V7: Lykon/dreamshaper-7
Latent Consistency Model (LCM) LoRA for SDv1-5: latent-consistency/lcm-lora-sdv1-5
Latent Diffusion Model (LDM) for super-resolution: CompVis/ldm-super-resolution-4x-openimages

Image Generation Process

Pre-trained Diffusion Model: Generates initial low-resolution images.
Super-Resolution DNN Models: Enhances the resolution of generated images.
Post-Processing Techniques: Applies gradient smoothing and alpha blending to improve image quality and remove seams.
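
The first two stages above could be wired together roughly as follows. This is a minimal sketch using the Hugging Face diffusers library and the model IDs listed earlier. The function names, step counts, and guidance value are illustrative assumptions rather than the project's actual code, and the sketch upscales the whole image in one call instead of working tile by tile.

```python
# Model IDs taken from the list of pre-trained models in this document.
BASE_MODEL = "Lykon/dreamshaper-7"
LCM_LORA = "latent-consistency/lcm-lora-sdv1-5"
SR_MODEL = "CompVis/ldm-super-resolution-4x-openimages"


def load_generator(device="cuda"):
    """Load the base diffusion model with the LCM LoRA applied (hypothetical helper)."""
    # Imports are local so this file can be imported without a GPU stack installed.
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)
    # LCM LoRA weights are meant to be used with the LCM scheduler.
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(LCM_LORA)
    return pipe.to(device)


def load_upscaler(device="cuda"):
    """Load the 4x latent-diffusion super-resolution pipeline (hypothetical helper)."""
    from diffusers import LDMSuperResolutionPipeline

    return LDMSuperResolutionPipeline.from_pretrained(SR_MODEL).to(device)


def generate_and_upscale(prompt, generator, upscaler, gen_steps=4, sr_steps=100):
    """Generate a low-resolution image, then upscale it 4x.

    The step counts are assumed values, not settings reported by the project.
    """
    low_res = generator(
        prompt, num_inference_steps=gen_steps, guidance_scale=1.0
    ).images[0]
    # Upscaler memory grows with input size; the whole image is passed here.
    return upscaler(
        low_res.convert("RGB"), num_inference_steps=sr_steps, eta=0.3
    ).images[0]
```

Calling generate_and_upscale("person smiling at the camera", load_generator(), load_upscaler()) would download the three models on first use and return a PIL image. The post-processing stage is not covered by this sketch.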

Improvements

Memory Profile Optimization: Reduces the memory required for image generation, making it feasible on mobile devices.
Artifact Reduction: Enhances image quality by minimizing artifacts typically present in high-resolution images.
Parallel Processing: Implements techniques to generate images in a parallel fashion, improving efficiency and speed.
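
One way to picture the memory and parallelism points above is tile-based processing: split the image into four overlapping quadrants and run each through the upscaler separately. The sketch below assumes that layout, based on the quadrants mentioned under Seam Removal Process. The names split_quadrants, upscale_stub, and process_tiles are hypothetical, and upscale_stub is a nearest-neighbour placeholder standing in for the real super-resolution model.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def split_quadrants(image, overlap):
    """Split an H x W (x C) array into four quadrants that each extend
    `overlap` pixels past the image midlines."""
    h, w = image.shape[0] // 2, image.shape[1] // 2
    return [
        image[: h + overlap, : w + overlap],  # top-left
        image[: h + overlap, w - overlap:],   # top-right
        image[h - overlap:, : w + overlap],   # bottom-left
        image[h - overlap:, w - overlap:],    # bottom-right
    ]


def upscale_stub(tile, factor=4):
    """Placeholder upscaler: repeats each pixel `factor` times per axis."""
    return tile.repeat(factor, axis=0).repeat(factor, axis=1)


def process_tiles(tiles, fn, max_workers=4):
    """Apply `fn` to every tile using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, tiles))
```

Processing one tile at a time keeps peak memory close to the cost of a single tile, while running tiles concurrently trades some of that saving for speed. Which balance the project uses is not stated here, so max_workers is left as a parameter.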

Seam Removal Process

Seams can appear where image quadrants are merged. To remove them, FROGS applies alpha blending across the overlapping regions, so that each quadrant fades gradually into its neighbor instead of ending at a hard edge.
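
As an illustration of alpha blending across an overlap, the sketch below merges two tiles that share a strip of pixels, using a linear weight ramp. The linear ramp and the function names blend_horizontal and blend_vertical are assumptions made for this example; the project's exact blending weights are not specified here.

```python
import numpy as np


def blend_horizontal(left, right, overlap):
    """Merge two tiles side by side.

    The last `overlap` columns of `left` and the first `overlap` columns of
    `right` are assumed to cover the same pixels. Returns a float32 array.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    # Weight of the left tile falls linearly from 1 to 0 across the overlap.
    alpha = np.linspace(1.0, 0.0, overlap, dtype=np.float32)
    # Reshape so alpha broadcasts over rows (and channels, if present).
    alpha = alpha.reshape((1, overlap) + (1,) * (left.ndim - 2))
    seam = alpha * left[:, -overlap:] + (1.0 - alpha) * right[:, :overlap]
    return np.concatenate([left[:, :-overlap], seam, right[:, overlap:]], axis=1)


def blend_vertical(top, bottom, overlap):
    """Same as blend_horizontal, but stacks tiles top to bottom."""
    merged = blend_horizontal(
        np.swapaxes(top, 0, 1), np.swapaxes(bottom, 0, 1), overlap
    )
    return np.swapaxes(merged, 0, 1)
```

Four quadrants can be merged by blending the top pair and the bottom pair horizontally, then blending the two resulting halves vertically. The result is float32, so it would typically be clipped to the 0 to 255 range and cast back to uint8 before saving.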

Example Images

Prompt: person smiling at the camera
Image with Visible Seams and Overlapping Edges:
Seam-Removed Image:

Performance Metrics

Memory Usage:
- Image generation: 1.5 GB
- Super-resolution: 7.7 GB (can be reduced to ~2.0 GB with CPU offloading)
Time:
- Image generation: ~7 seconds
- Super-resolution: ~15 seconds
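
Figures like these can be collected with a small timing and memory helper. The sketch below is one possible way to do it and is not necessarily how the numbers above were measured. profile_call is a hypothetical helper, and it reports peak memory allocated through PyTorch, which can be lower than the total VRAM shown by tools such as nvidia-smi.

```python
import time


def profile_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, seconds, peak_gb).

    peak_gb is None when PyTorch or a CUDA device is unavailable.
    """
    try:
        import torch
        use_cuda = torch.cuda.is_available()
    except ImportError:
        torch, use_cuda = None, False

    if use_cuda:
        # Start from a clean peak counter and an idle GPU.
        torch.cuda.reset_peak_memory_stats()
        torch.cuda.synchronize()

    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if use_cuda:
        # Wait for queued GPU work so the timing covers the whole call.
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    peak_gb = torch.cuda.max_memory_allocated() / 1024**3 if use_cuda else None
    return result, elapsed, peak_gb
```

Wrapping each stage separately, for example the generation call and the super-resolution call, gives per-stage numbers comparable to the breakdown above.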

Results

FROGS combines Stable Diffusion and super-resolution models into a single pipeline that runs natively on memory-limited systems. With CPU offloading, super-resolution uses about 2.0 GB of VRAM instead of 7.7 GB, so high-resolution images can be generated on resource-constrained devices without sacrificing quality.

Examples (See Slides Below)

Celestial Clockwork: A depiction of a celestial clockwork with planets and stars moving in harmony.
Japanese Cherry Blossoms: A serene Japanese garden in spring, with cherry blossoms in full bloom.
Winter Wonderland: A magical snowy landscape with a cozy cabin and a penguin.
Space Odyssey: A stunning view of a distant galaxy from the window of a spaceship, with planets and stars.
Aurora Over Mountains: Northern Lights dancing over snow-capped mountains under a starry sky.
Redwood Forest National Park: Photorealistic depiction of a redwood forest.

Future Work

Porting to macOS/iOS: Adapt the program for Apple’s ecosystem.
User Interface: Develop a user-friendly interface for easier interaction.
Executable Compilation: Compile the program into a single executable for easy deployment.

References

For a detailed list of references, please refer to the complete research paper.

Paper
