# QRAF

**Repository Path**: ByteDance/QRAF

## Basic Information

- **Project Name**: QRAF
- **Description**: Official implementation of "QRAF: a Quantization-error-aware Rate Adaption Framework for Learned Image Compression"
- **Primary Language**: Unknown
- **License**: BSD-3-Clause-Clear
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-01-19
- **Last Updated**: 2025-06-17

## Categories & Tags

**Categories**: image-processing
**Tags**: None

## README

# QVRF: A QUANTIZATION-ERROR-AWARE VARIABLE RATE FRAMEWORK FOR LEARNED IMAGE COMPRESSION

Official implementation of "QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression".

## Table of Contents

- [Environment](#environment)
- [Dataset](#dataset)
- [Training](#training)
- [Update](#update)
- [Inference](#inference)
- [RD Results](#rd-results)

# Environment

We recommend using Miniconda.

```
# python>=3.6 should be fine.
conda create -n qraf python=3.8
conda activate qraf
pip install compressai==1.1.5
# pip install compressai==1.1.5 -i https://pypi.mirrors.ustc.edu.cn/simple/
```

# Dataset

Download the [Collection of Kodak](https://drive.google.com/file/d/1Fst3a0naKWx28zX--kDB5G_T6Kyec9R6/view?usp=sharing) or the original [Kodak](http://r0k.us/graphics/kodak/) images for testing, then place them under `./dataset`:

```
mkdir dataset
mv Kodak ./dataset
```

# Training

Use `Trainningdataset_Preprocessing.py` to select the largest 8000 images from [ImageNet](http://www.image-net.org/challenges/LSVRC/2012/dd31405981ef5f776aa17412e1f0c112/ILSVRC2012_img_train.tar) and 584 images from [CLIC2020](https://data.vision.ee.ethz.ch/cvl/clic/professional_train_2020.zip), and to preprocess them into the training dataset.

## Parameters

- `dataset`: dir. Directory of the training and validation dataset.
- `epochs`: int, default `1000`. Number of epochs.
- `learning-rate`: float, default `1e-4`. Learning rate.
- `num-workers`: int, default `4`. Number of dataloader threads.
- `batch-size`: int, default `16`. Batch size.
- `test-batch-size`: int, default `64`. Test batch size.
- `patch-size`: int, default `256 256`. Size of the patches to be cropped.
- `cuda`: Use CUDA.
- `save`: Save the model to disk.
- `seed`: float, default `1926`. Random seed for reproducibility.
- `clip_max_norm`: float, default `1.0`. Gradient clipping max norm.
- `checkpoint`: str. Checkpoint path.
- `stage`: int, default `1`. Training stage (see the sketch below).
- `ste`: int, default `0`. Use STE rounding in the fine-tuning stage (see the sketch below).
- `loadFromPretrainedSinglemodel`: int, default `0`. Load a fixed-rate model.
- `refresh`: int, default `0`. Reset the optimizer and epoch counter.
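The `stage` and `ste` flags select how quantization is simulated during training. Below is a minimal sketch of the usual noise-vs-STE formulation combined with a rate-adaption scaling factor, as commonly used in learned image compression; the helper name is hypothetical, and this is a sketch rather than the repository's exact code.

```python
import torch

def simulate_quantization(y, factor, ste=False, training=True):
    """Simulate quantization of latents y with bin size 1/factor.

    `factor` is the reciprocal of the quantization bin size: a larger
    factor means finer bins and therefore a higher bitrate.
    """
    y = y * factor  # scale so rounding uses bins of size 1/factor
    if training and not ste:
        # Early stages (--ste 0): additive uniform noise approximates
        # the rounding error while keeping gradients smooth.
        y_hat = y + torch.empty_like(y).uniform_(-0.5, 0.5)
    else:
        # Fine-tuning stage (--ste 1): straight-through estimator round,
        # i.e. hard rounding in the forward pass, identity gradient backward.
        y_hat = y + (torch.round(y) - y).detach()
    return y_hat / factor  # undo the scaling
```

At inference time, `--factor` plays the same role: it sets the quantization bin size and therefore the operating point on the rate-distortion curve.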
## Training Stage 1

```
python3 train.py -d ./dataset -e 2000 -lr 1e-4 -n 8 --batch-size 8 --test-batch-size 64 --aux-learning-rate 1e-3 --patch-size 256 256 --cuda --save --seed 1926 --clip_max_norm 1.0 --stage 1 --ste 0 --loadFromPretrainedSinglemodel 0
```

## Training Stage 2

```
python3 train.py -d ./dataset -e 500 -lr 1e-4 -n 8 --batch-size 8 --test-batch-size 64 --aux-learning-rate 1e-3 --patch-size 256 256 --cuda --save --seed 1926 --clip_max_norm 1.0 --stage 2 --ste 0 --refresh 1 --loadFromPretrainedSinglemodel 0 --checkpoint checkpoint_best_loss.pth.tar | tee Cheng2020Noise.txt
```

Alternatively, you can load a pretrained fixed-rate model and fine-tune it with QVRF:

```
python3 train.py -d ./dataset -e 500 -lr 1e-4 -n 8 --batch-size 8 --test-batch-size 64 --aux-learning-rate 1e-3 --patch-size 256 256 --cuda --save --seed 1926 --clip_max_norm 1.0 --stage 2 --ste 0 --refresh 1 --loadFromPretrainedSinglemodel 1 --checkpoint cheng2020_attn-mse-6-730501f2.pth.tar | tee Cheng2020Noise.txt
```

## Training Stage 3

```
python3 train.py -d ./dataset -e 500 -lr 1e-4 -n 8 --batch-size 8 --test-batch-size 64 --aux-learning-rate 1e-3 --patch-size 256 256 --cuda --save --seed 1926 --clip_max_norm 1.0 --stage 3 --ste 1 --refresh 1 --loadFromPretrainedSinglemodel 0 --checkpoint checkpoint_best_loss.pth.tar | tee Cheng2020STE.txt
```

# Update

Once the model is trained, run `update.py` to fix the entropy model.

- `name`: str. Exported model name.
- `dir`: str. Exported model directory.

```commandline
python3 update.py checkpoint_best_loss.pth.tar -n Cheng2020VR
```

# Inference

Download the checkpoint of the [variable rate model of Cheng2020](https://drive.google.com/file/d/1aydW2y2yohjD-cfcQKZv-SIO_FhDF4M-/view?usp=sharing) for inference.

## Parameters

- `dataset`: str. Test dataset path.
- `s`: int, default `2`. Discrete bitrate index.
- `output_path`: str. Name of the reconstruction directory.
- `p`: str. Checkpoint path.
- `patch`: int, default `64`. Padding size (see the padding sketch below).
- `factormode`: int in `{0, 1}`, default `0`. Whether to use continuous bitrate adaption.
- `factor`: float in `[0.5, 12]`, default `1.5`. Reciprocal of the continuous-bitrate quantization bin size.

## Inference code

```
python3 Inference.py --dataset TestDataset --s 2 --output_path output_pathName -p CheckpointPath --patch 64 --factormode 1 --factor 0.1
```

## Examples of Inference.py

### Discrete bitrate results

For all discrete bitrate results:

```
python3 Inference.py --dataset ./dataset/Kodak --s 8 --output_path AttentionVRSTE -p ./Cheng2020VR.pth.tar --patch 64 --factormode 0 --factor 0
```

For the discrete bitrate result at a given `Index`, where `Index` is in {0, 1, 2, 3, 4, 5, 6, 7}:

```
python3 Inference.py --dataset ./dataset/Kodak --s Index --output_path AttentionVRSTE -p ./Cheng2020VR.pth.tar --patch 64 --factormode 0 --factor 0
```

### Continuous bitrate results

Example of continuous bitrate results:

```
python3 Inference.py --dataset ./dataset/Kodak --s 2 --output_path AttentionVRSTE -p ./Cheng2020VR.pth.tar --patch 64 --factormode 1 --factor 0.1
```

To use an arbitrary quantization bin size QBS, set `--factor` to 1/QBS, where 1/QBS lies in [0.5, 12]:

```
python3 Inference.py --dataset ./dataset/Kodak --s 2 --output_path AttentionVRSTE -p ./Cheng2020VR.pth.tar --patch 64 --factormode 1 --factor 1/QBS
```

### Note

A higher bitrate corresponds to a larger `factor` value, which is the reciprocal of the quantization bin size.
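The `--patch` argument sets the padding size; learned codecs typically pad each image so its height and width are multiples of the network's total downsampling stride (here presumably 64) before compression, and crop the reconstruction back afterwards. A minimal sketch under that assumption, with hypothetical helpers rather than the repository's exact code:

```python
import torch.nn.functional as F

def pad_to_multiple(x, patch=64):
    """Pad an image tensor (N, C, H, W) so H and W are multiples of `patch`."""
    h, w = x.shape[-2:]
    pad_h = (patch - h % patch) % patch
    pad_w = (patch - w % patch) % patch
    # Pad only on the right/bottom; replication avoids sharp border artifacts.
    x_padded = F.pad(x, (0, pad_w, 0, pad_h), mode="replicate")
    return x_padded, (h, w)

def crop_to_original(x, size):
    """Crop a padded reconstruction back to its original (H, W)."""
    h, w = size
    return x[..., :h, :w]
```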
## Discrete/Continuous Variable Rate Results

![](assets/VariableRate.png)

## Note

According to the public code and paper, the Cheng2020 models were trained only for the low and medium rates, with lambda in {0.0016, 0.0032, 0.0075, 0.015, 0.03, 0.045}. For a fair comparison, we re-trained Cheng2020 on our training dataset following the original paper's settings, with lambda in {0.0018, 0.0035, 0.0067, 0.0130, 0.0250, 0.0483, 0.0932, 0.1800}.

## Improvement of QVRF

Using the extended predefined Lagrange multiplier set {0.0018, 0.0035, 0.0067, 0.0130, 0.025, 0.0483, 0.0932, 0.18, 0.36, 0.72, 1.44}, QVRF achieves better results at high bitrates.

![](assets/moreLagrangeMultiplier.png)

You can download the [checkpoint](https://drive.google.com/file/d/1jMJxvT-YFJ8L3W-mx4-yjBz72MfgMC3e/view?usp=sharing) to test the results with this extended Lagrange multiplier set.

# RD Results

We used the 8 pretrained discrete models of Balle et al. and Minnen et al. from [CompressAI](https://github.com/InterDigitalInc/CompressAI) as benchmarks, and re-trained 8 Cheng2020 models on our training dataset covering low to high bitrates (all models use 192 channels for comparison).

## RD Curves on the Kodak Dataset (24 Images)

### Comparison of variable rate methods. The baseline is Balle, with 8 fixed-rate models as the anchor.

![](assets/Balle.png)

### Comparison of variable rate methods. The baseline is Minnen, with 8 fixed-rate models as the anchor.

![](assets/Minnen.png)

### Comparison of variable rate methods. The baseline is Cheng2020, with 8 fixed-rate models as the anchor.

![](assets/Cheng2020.png)
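Each point on these RD curves pairs a bitrate (bits per pixel) with a quality measure, typically PSNR, averaged over the 24 Kodak images. A minimal sketch of the two metrics, assuming image tensors scaled to [0, 1]; the helpers are hypothetical, not the repository's exact evaluation code.

```python
import math
import torch

def psnr(x, x_hat, max_val=1.0):
    """PSNR in dB between an original image and its reconstruction."""
    mse = torch.mean((x - x_hat) ** 2).item()
    return 10 * math.log10(max_val ** 2 / mse)

def bpp(bitstreams, height, width):
    """Bits per pixel, given the compressed byte strings for one image."""
    total_bits = sum(len(s) for s in bitstreams) * 8
    return total_bits / (height * width)
```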