OpenAI CLIP: Connecting Text and Images (Paper Explained)
Yannic Kilcher
173,202 views, 5 years ago

#ai #openai #technology

Paper Title: Learning Transferable Visual Models From Natural Language Supervision

CLIP trains on 400 million images scraped from the web, along with their text descriptions, to learn a model that can connect the two modalities. The core idea is a contrastive objective combined with a large batch size. The resulting model can be turned into arbitrary zero-shot classifiers for new image & text tasks.

OUTLINE:
0:00 - Introduction
3:15 - Overview
4:40 - Connecting Images & Text
9:00 - Building Zero-Shot Classifiers
14:40 - CLIP Contrastive Training Objective
22:25 - Encoder Choices
25:00 - Zero-Shot CLIP vs Linear ResNet-50
31:50 - Zero-Shot vs Few-Shot
35:35 - Scaling Properties
36:35 - Comparison on different tasks
37:40 - Robustness to Data Shift
44:20 - Broader Impact Section
47:00 - Conclusion & Comments

Paper: https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf
Blog: https://openai.com/blog/clip/
Code: https://github.com/openai/CLIP

Abstract:
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on.

Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/
BiliBili: https://space.bilibili.com/1824646584

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
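
As a rough illustration of the contrastive objective and the zero-shot classifiers mentioned in the description, here is a minimal pure-Python sketch. It is not the authors' code: the function names are hypothetical, the embeddings are toy vectors standing in for the outputs of an image encoder and a text encoder, and the temperature of 0.07 is an assumed fixed value chosen for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Pairwise cosine similarities divided by a temperature (assumed value)."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def cross_entropy(logits_row, target):
    """Negative log-softmax of the target entry, computed stably."""
    m = max(logits_row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits_row))
    return log_sum - logits_row[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric loss over a batch where pair k is (image k, caption k).

    Each image should pick out its own caption among all captions in the
    batch, and each caption should pick out its own image.
    """
    logits = similarity_logits(image_embs, text_embs, temperature)
    n = len(logits)
    loss_images = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    columns = [list(col) for col in zip(*logits)]  # caption-to-image view
    loss_texts = sum(cross_entropy(columns[k], k) for k in range(n)) / n
    return (loss_images + loss_texts) / 2

def zero_shot_classify(image_emb, class_text_embs):
    """Return the index of the class prompt that best matches the image."""
    logits = similarity_logits([image_emb], class_text_embs)[0]
    return max(range(len(logits)), key=lambda k: logits[k])

# Toy batch: two images and their two captions (hypothetical embeddings).
images = [[1.0, 0.0], [0.0, 1.0]]
captions = [[0.9, 0.1], [0.1, 0.9]]
print(contrastive_loss(images, captions))
print(zero_shot_classify([0.8, 0.2], captions))
```

In this sketch a larger batch means every image is contrasted against more non-matching captions, which is one way to read the description's remark about batch size. For zero-shot use, the class names of a new task would be embedded as text and passed as `class_text_embs`.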