PE-Core L/14

Vision foundation model trained with a contrastive vision-language objective.

Tags: Meta, Open Source, Vision Only
Model Type: Generic
Overall Rank: #2
Avg P@1: 42.9%
Avg mAP@10: 34.0%
Embed Dim: 1024
Input Res: 336
Datasets: 8

About This Model

Overview

The Perception Encoder (PE) is a vision foundation model developed by Meta AI. It is part of a family of image and video encoders trained with a large-scale contrastive vision-language objective. Compared with earlier CLIP-style models, PE uses improved training recipes, larger curated datasets, and alignment strategies that draw embeddings not only from the final layer but also from highly informative intermediate layers.

Architecture

The PE-Core L/14 variant used in this study is built on a Vision Transformer Large (ViT-L) architecture and takes 336×336 input images. It produces 1,024-dimensional image embeddings designed to generalize across classification, retrieval, and dense vision tasks.

Training

The model is trained on billions of image-text pairs and further refined with video-based alignment, resulting in strong robustness to domain shift and high-quality global representations.

Evaluation Setup

Although Perception Encoder supports multimodal and video-aware representation learning, our evaluation uses it purely as an image-to-image retrieval encoder. This lets us assess how well a modern, high-capacity vision-only foundation model performs in fine-grained industrial instance retrieval, where distinguishing between visually similar parts is critical.
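As a rough illustration of this setup, the sketch below ranks gallery embeddings by cosine similarity to a query embedding and scores the ranking with precision and recall at k. The page does not define its similarity function or metrics, so the following are all assumptions: cosine similarity, the definitions P@k = hits / k and R@k = hits / (number of relevant gallery items), the function names, and the toy 3-dimensional vectors standing in for 1,024-dimensional PE embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery indices sorted by descending similarity to the query."""
    scores = [cosine_similarity(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


def precision_recall_at_k(ranked_labels, query_label, k, num_relevant):
    """Assumed definitions: P@k = hits / k, R@k = hits / num_relevant."""
    hits = sum(1 for label in ranked_labels[:k] if label == query_label)
    return hits / k, hits / num_relevant


# Toy gallery: two images of the same part ("bolt") plus two distractors.
gallery = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["bolt", "bolt", "nut", "washer"]
query, query_label = [1.0, 0.05, 0.0], "bolt"

ranking = rank_gallery(query, gallery)
ranked_labels = [labels[i] for i in ranking]
num_relevant = labels.count(query_label)

p1, r1 = precision_recall_at_k(ranked_labels, query_label, 1, num_relevant)
p2, r2 = precision_recall_at_k(ranked_labels, query_label, 2, num_relevant)
print(ranking)  # [0, 1, 2, 3]
print(p1, r1)   # 1.0 0.5
print(p2, r2)   # 1.0 1.0
```

Under these definitions, R@1 cannot exceed 1 / num_relevant, so a query with many matching gallery images can have a high P@1 and a low R@1 at the same time.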

Performance Across Datasets

| Dataset | Category | P@1 | P@5 | R@1 | R@5 | mAP@10 |
|---|---|---|---|---|---|---|
| VPRC 2023 | Mixed Retail | 32.97% | 15.19% | 21.90% | 45.15% | 36.49% |
| Intercars | Automotive | 18.82% | 16.59% | 6.34% | 20.62% | 21.38% |
| Stanford Online Products | E-commerce | 80.09% | 54.56% | 19.83% | 52.67% | 57.95% |
| IKEA | Furniture | 38.27% | 25.28% | 9.16% | 25.50% | 25.14% |
| Hornbach | Hardware/DIY | 25.20% | 9.78% | 25.20% | 48.88% | 33.80% |
| ARaymond | Industrial | 12.14% | 8.01% | 0.76% | 2.50% | 3.31% |
| Products-10K | E-commerce | 65.63% | 38.80% | 13.99% | 40.74% | 39.50% |
| TOPEX | Industrial | 69.87% | 65.48% | 2.18% | 10.23% | 54.74% |
| Average | - | 42.87% | 29.21% | 12.42% | 30.79% | 34.04% |
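
The Average row is consistent with an unweighted mean over the eight datasets. The sketch below recomputes it from the per-dataset values in the table; the variable and function names are illustrative, and equal weighting per dataset is an assumption rather than something the page states.

```python
# Per-dataset results copied from the table: (P@1, P@5, R@1, R@5, mAP@10), in percent.
METRICS = ("P@1", "P@5", "R@1", "R@5", "mAP@10")
RESULTS = {
    "VPRC 2023": (32.97, 15.19, 21.90, 45.15, 36.49),
    "Intercars": (18.82, 16.59, 6.34, 20.62, 21.38),
    "Stanford Online Products": (80.09, 54.56, 19.83, 52.67, 57.95),
    "IKEA": (38.27, 25.28, 9.16, 25.50, 25.14),
    "Hornbach": (25.20, 9.78, 25.20, 48.88, 33.80),
    "ARaymond": (12.14, 8.01, 0.76, 2.50, 3.31),
    "Products-10K": (65.63, 38.80, 13.99, 40.74, 39.50),
    "TOPEX": (69.87, 65.48, 2.18, 10.23, 54.74),
}


def macro_average(rows):
    """Unweighted mean of each metric column, rounded to two decimals."""
    rows = list(rows)
    return tuple(round(sum(col) / len(rows), 2) for col in zip(*rows))


averages = macro_average(RESULTS.values())
for name, value in zip(METRICS, averages):
    print(f"{name}: {value:.2f}%")
```

The recomputed values match the reported Average row (42.87%, 29.21%, 12.42%, 30.79%, 34.04%), so each dataset contributes equally regardless of its size.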