Image Quality Assessment

Fig. 1. The proposed dual-branch vision transformer for blind image quality assessment (BIQA).

Blind image quality assessment (BIQA) aims to predict the perceptual quality of an image without access to a reference. We propose a dual-branch vision transformer that simultaneously considers both local distortions and global semantic information. Dual-scale features (S-Feature and L-Feature) are extracted from a ResNet-50 backbone and fed into separate transformer encoder branches. Each branch captures scale-variant local distortions through local feature embeddings, and jointly models global distortion context via content-aware IQA (CA-IQA) embeddings. The outputs of both branches are combined through feed-forward blocks to predict the final image quality score.

TABLE 1. Average SRCC results on six IQA databases. Best and second-best results are **bold** and underlined, respectively.
Method	LIVEC	TID2013	LIVE	CSIQ	LIVE MD	KADID-10k
BRISQUE	0.608	0.604	0.939	0.746	0.886	0.528
M3	0.607	0.689	0.951	0.795	0.892	-
FRIQUEE	0.682	0.680	0.940	0.835	0.923	-
CORNIA	0.629	0.678	0.947	0.678	0.899	-
HOSA	0.640	0.735	0.946	0.741	0.913	-
Le-CNN	-	-	0.956	-	-	-
BIECON	0.595	0.717	0.961	0.815	0.909	0.623
DIQaM-NR	0.606	0.835	0.960	-	-	-
WaDIQaM-NR	0.671	0.761	0.954	-	-	0.739
ResNet-ft	0.819	0.712	0.950	0.876	0.909	-
IW-CNN	0.663	0.800	0.963	0.812	0.914	-
DBCNN	0.851	0.816	0.968	0.946	0.927	0.851
HyperIQA	0.859	0.797	0.962	0.923	0.898	0.852
TReS	0.846	0.863	0.969	0.922	0.916	0.859
BIQA, M.D.	-	0.835	0.969	0.903	-	-
RNSA	0.871	0.849	0.969	0.931	-	0.855
Proposed	0.862	0.877	0.976	0.942	0.935	0.970
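
For readers unfamiliar with the metric in Table 1, the following is a minimal sketch of how the Spearman rank-order correlation coefficient (SRCC) between predicted scores and subjective scores can be computed. It is an illustration only, not the evaluation code used for the table. The function names (`average_ranks`, `srcc`) and the sample scores are hypothetical, and ties are handled by average ranking, which is an assumption about the protocol.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def srcc(predicted, subjective):
    """SRCC: Pearson correlation computed on the ranks of both sequences."""
    return _pearson(average_ranks(predicted), average_ranks(subjective))


if __name__ == "__main__":
    # Hypothetical model outputs and mean opinion scores for five images.
    predicted = [0.31, 0.55, 0.42, 0.90, 0.78]
    subjective = [25.0, 48.0, 51.0, 88.0, 70.0]
    print(f"SRCC = {srcc(predicted, subjective):.3f}")
```

Because SRCC depends only on rank order, any monotonic rescaling of the predicted scores leaves it unchanged, which is why it is commonly reported alongside a linear measure such as PLCC.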

TABLE 2. Average PLCC results on six IQA databases. Best and second-best results are **bold** and underlined, respectively.
Method	LIVEC	TID2013	LIVE	CSIQ	LIVE MD	KADID-10k
BRISQUE	0.629	0.694	0.935	0.829	0.917	0.567
M3	0.630	0.771	0.950	0.839	0.919	-
FRIQUEE	0.705	0.753	0.944	0.874	0.934	-
CORNIA	0.671	0.768	0.950	0.776	0.921	-
HOSA	0.678	0.815	0.947	0.823	0.926	-
Le-CNN	-	-	0.953	-	-	-
BIECON	0.613	0.762	0.962	0.823	0.933	0.648
DIQaM-NR	0.601	0.855	0.972	-	-	-
WaDIQaM-NR	0.680	0.787	0.963	-	-	0.752
ResNet-ft	0.849	0.756	0.954	0.905	0.920	-
IW-CNN	0.705	0.802	0.964	0.791	0.929	-
DBCNN	0.869	0.865	0.971	0.959	0.869	0.856
HyperIQA	0.882	0.823	0.966	0.942	0.924	0.845
TReS	0.877	0.883	0.968	0.942	0.921	0.858
BIQA, M.D.	-	0.859	0.978	0.925	-	-
RNSA	0.883	0.861	0.972	0.959	-	0.859
Proposed	0.882	0.894	0.976	0.952	0.935	0.971
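
As a companion to Table 2, the sketch below shows one way the Pearson linear correlation coefficient (PLCC) could be computed per test split and then averaged, matching the "average" wording of the caption. This is an assumption about the protocol rather than the authors' evaluation code: the names `plcc` and `mean_plcc_over_splits` and the sample data are hypothetical, and the page does not say whether a nonlinear (for example logistic) mapping is applied to predictions before computing PLCC, so none is applied here.

```python
from math import sqrt


def plcc(predicted, subjective):
    """Pearson linear correlation between predictions and subjective scores."""
    n = len(predicted)
    if n != len(subjective) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mp = sum(predicted) / n
    ms = sum(subjective) / n
    cov = sum((p - mp) * (s - ms) for p, s in zip(predicted, subjective))
    vp = sum((p - mp) ** 2 for p in predicted)
    vs = sum((s - ms) ** 2 for s in subjective)
    return cov / sqrt(vp * vs)


def mean_plcc_over_splits(splits):
    """Average PLCC over (predicted, subjective) pairs, one pair per split."""
    scores = [plcc(pred, subj) for pred, subj in splits]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two hypothetical test splits, each with predictions and opinion scores.
    splits = [
        ([0.31, 0.55, 0.42, 0.90], [25.0, 48.0, 51.0, 88.0]),
        ([0.10, 0.65, 0.80, 0.35], [12.0, 60.0, 79.0, 40.0]),
    ]
    print(f"mean PLCC = {mean_plcc_over_splits(splits):.3f}")
```

Unlike SRCC, PLCC is sensitive to the shape of the relationship: predictions that are perfectly monotonic but nonlinearly related to the subjective scores give a PLCC below 1.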

Publications

Se-Ho Lee and Seung-Wook Kim, “Dual-branch vision transformer for blind image quality assessment,” Journal of Visual Communication and Image Representation, vol. 94, Art. no. 103850, Jun. 2023.
