Representation learning neural networks are a class of architectures designed to automatically learn useful feature representations of data (such as images, text, audio, or tabular data) without requiring manual feature engineering. Instead of relying on hand-crafted features, these models discover latent structures and embeddings that make downstream tasks like classification, retrieval, or generation more effective. Representation learning is foundational to modern deep learning and underpins many state-of-the-art models in vision, language, and multimodal AI.
Autoencoders focus on reconstructing inputs through a bottleneck layer to learn compressed representations; they are often simpler and more interpretable for certain modalities.
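The bottleneck-reconstruction idea can be sketched as a minimal linear autoencoder in NumPy; the toy data, dimensions, and learning rate below are illustrative assumptions, not a reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples of 8-dimensional inputs with hidden low-rank structure.
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8))

# Linear autoencoder: 8 -> 2 (bottleneck) -> 8, trained by gradient descent on MSE.
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

def mse(a, b):
    return float(np.mean((a - b) ** 2))

lr = 0.05
initial_loss = mse(X, X @ W_enc @ W_dec)
for _ in range(500):
    Z = X @ W_enc                      # compressed representation (bottleneck)
    X_hat = Z @ W_dec                  # reconstruction
    err = X_hat - X
    grad_dec = Z.T @ err / len(X)      # gradient of the reconstruction error
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final_loss = mse(X, X @ W_enc @ W_dec)
```

After training, `X @ W_enc` is the 2-dimensional compressed representation; because the data is low-rank, the reconstruction loss drops well below its starting value.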
Contrastive learning methods use objectives that bring similar samples closer together and push dissimilar ones apart in embedding space, often achieving strong performance with unlabeled data.
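One common contrastive objective of this kind is the InfoNCE loss, sketched here in NumPy under assumed toy embeddings (batch size, dimensions, and temperature are hypothetical):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss for a batch of paired embeddings.

    z1[i] and z2[i] are two views of the same sample (a positive pair);
    every other pairing in the batch acts as a negative.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                  # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))        # positives sit on the diagonal

rng = np.random.default_rng(0)
anchor = rng.normal(size=(16, 32))
aligned = anchor + 0.05 * rng.normal(size=(16, 32))   # views close to their anchors
unrelated = rng.normal(size=(16, 32))                 # embeddings with no alignment

loss_aligned = info_nce_loss(anchor, aligned)
loss_unrelated = info_nce_loss(anchor, unrelated)
```

The loss is low when positive pairs are close in embedding space relative to negatives, so `loss_aligned` comes out well below `loss_unrelated`; minimizing it shapes the embedding exactly as described above.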
Supervised, task-specific models learn representations directly optimized for a specific labeled task, without an explicit focus on general-purpose embeddings.
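A minimal sketch of task-driven representation learning, assuming a toy binary classification problem: the hidden layer of a small NumPy MLP is shaped entirely by the labeled objective, with no reconstruction or contrastive term.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy binary task: the label depends on the sign of a hidden linear direction.
X = rng.normal(size=(400, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)

# One-hidden-layer MLP; the hidden activations are the learned representation.
W1 = rng.normal(scale=0.5, size=(10, 16))
w2 = rng.normal(scale=0.5, size=16)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

lr = 0.5
for _ in range(500):
    H = np.tanh(X @ W1)               # task-driven representation
    p = sigmoid(H @ w2)               # predicted probability of class 1
    g = (p - y) / len(X)              # gradient of binary cross-entropy
    w2 -= lr * (H.T @ g)
    W1 -= lr * (X.T @ (np.outer(g, w2) * (1 - H ** 2)))

p_final = sigmoid(np.tanh(X @ W1) @ w2)
accuracy = float(np.mean((p_final > 0.5) == (y == 1)))
```

The representation `np.tanh(X @ W1)` is optimized only for this task; it typically transfers less well to other tasks than representations learned with reconstruction or contrastive objectives.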