ICML Integrating ViT-Derived Visual Semantics into Credit Scoring Models for Informal Businesses in Latin America

Poster in Affinity Workshop: LatinX in AI

Alejandro Mildiner · Michael Moreno · Viviana Siless

Abstract:

Access to formal credit remains a significant barrier for micro-entrepreneurs in Latin America. This work introduces a novel credit scoring methodology that integrates image-based cluster features, extracted via Vision Transformer (ViT) models, into a tabular classification pipeline. Using images submitted by loan applicants, we generate high-dimensional embeddings with ViT (Dosovitskiy et al., 2021), CLIP (Radford et al., 2021), and I-JEPA (Assran et al., 2023), and apply unsupervised clustering to discover latent visual patterns correlated with creditworthiness. Each applicant is then encoded by their proximity to the learned clusters, yielding categorical features that represent visual similarity to known payer profiles. These features are incorporated into XGBoost models alongside financial and demographic data. Our results show that visual-cluster-based features improve predictive performance: they outperform a baseline model that uses traditional credit bureau indicators and alternative data (AUC 0.79), reaching an AUC of 0.843 with I-JEPA. This approach demonstrates how computer vision can provide interpretable, transferable insights from visual content, offering a new pathway toward fairer, more inclusive credit evaluation in underserved economies (Salcedo-Perez & Patino, 2018).
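The cluster-encoding step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm or the encoding, so plain k-means and a one-hot encoding of the nearest cluster are assumed here, the embeddings and tabular columns are synthetic stand-ins, and the helper names (`fit_kmeans`, `assign_clusters`, `add_cluster_features`) are hypothetical.

```python
import numpy as np


def assign_clusters(embeddings, centroids):
    """Index of the nearest centroid (Euclidean distance) for each embedding."""
    diffs = embeddings[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1)


def fit_kmeans(embeddings, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns a (k, d) centroid array."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct training points.
    start = rng.choice(len(embeddings), size=k, replace=False)
    centroids = embeddings[start].copy()
    for _ in range(n_iter):
        labels = assign_clusters(embeddings, centroids)
        for j in range(k):
            members = embeddings[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def add_cluster_features(tabular, embeddings, centroids):
    """Append a one-hot encoding of the nearest cluster to the tabular features."""
    labels = assign_clusters(embeddings, centroids)
    one_hot = np.eye(len(centroids))[labels]
    return np.hstack([tabular, one_hot])


# Synthetic stand-ins: 200 applicants, 16-dim "image embeddings", 3 tabular columns.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(200, 16))
tabular = rng.normal(size=(200, 3))

centroids = fit_kmeans(embeddings, k=4)
features = add_cluster_features(tabular, embeddings, centroids)
print(features.shape)  # (200, 7)
```

In a full pipeline, the embeddings would come from a pretrained ViT, CLIP, or I-JEPA encoder, and the resulting `features` matrix would be passed to a gradient-boosted classifier such as XGBoost together with the repayment labels.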
