Getting Started

Welcome to the official technical documentation for Manga OCR Mobile.

Usage

Before inference, images must be preprocessed to match the model's input expectations:

Canvas Setup: Create a 224x224 canvas filled with a white background.
Scaling: Scale the input image to fit within the 224x224 box while preserving its aspect ratio.
Centering: Center the scaled image on the canvas.
Normalization:
- Convert pixel values to floats in the range [0, 1] (divide by 255).
- Arrange data in NCHW format (Batch, Channels, Height, Width) with RGB channel order.
  - On colored pages, converting the image to grayscale and then back to RGB may give better results.
- The final input tensor shape is [1, 3, 224, 224].
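
The steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than the project's own code: it assumes Pillow and NumPy are available, the function name `preprocess` is hypothetical, and the bilinear resampling filter and the float32 output dtype are assumptions that the text above does not specify (an FP16 model may expect a different input dtype).

```python
import numpy as np
from PIL import Image

TARGET = 224  # canvas side length, taken from the steps above


def preprocess(image: Image.Image, size: int = TARGET) -> np.ndarray:
    """Letterbox an image onto a white square canvas.

    Returns a float array of shape [1, 3, size, size] with values in [0, 1].
    """
    image = image.convert("RGB")

    # Scale to fit inside the size x size box while preserving aspect ratio.
    scale = min(size / image.width, size / image.height)
    new_w = max(1, round(image.width * scale))
    new_h = max(1, round(image.height * scale))
    # Bilinear resampling is an assumption; the source does not name a filter.
    resized = image.resize((new_w, new_h), Image.BILINEAR)

    # White canvas with the scaled image pasted at its center.
    canvas = Image.new("RGB", (size, size), (255, 255, 255))
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))

    # HWC uint8 -> float in [0, 1] -> CHW, then add the batch dimension.
    # float32 is an assumption; cast to another dtype if the model needs it.
    array = np.asarray(canvas, dtype=np.float32) / 255.0
    return np.ascontiguousarray(array.transpose(2, 0, 1)[np.newaxis, ...])
```

For example, `preprocess(Image.open("page_crop.png"))` (a hypothetical file name) yields an array of shape `(1, 3, 224, 224)`. If the optional grayscale round trip is wanted for colored pages, one way to do it with Pillow is to call `image.convert("L").convert("RGB")` before passing the image in.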

Performance metrics for the FP16 model (INT8 dynamic is about 1.5x faster):