OpenAI CLIP: Connecting Text and Images (Paper Explained)
Yannic Kilcher

#ai #openai #technology

Paper Title: Learning Transferable Visual Models From Natural Language Supervision

CLIP trains on 400 million images scraped from the web, along with text descriptions, to learn a model that can connect the two modalities. The core idea is a contrastive objective combined with a large batch size. The resulting model can be turned into arbitrary zero-shot classifiers for new image & text tasks.

OUTLINE:
0:00 - Introduction
3:15 - Overview
4:40 - Connecting Images & Text
9:00 - Building Zero-Shot Classifiers
14:40 - CLIP Contrastive Training Objective
22:25 - Encoder Choices
25:00 - Zero-Shot CLIP vs Linear ResNet-50
31:50 - Zero-Shot vs Few-Shot
35:35 - Scaling Properties
36:35 - Comparison on different tasks
37:40 - Robustness to Data Shift
44:20 - Broader Impact Section
47:00 - Conclusion & Comments

Paper: https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf
Blog: https://openai.com/blog/clip/
Code: https://github.com/openai/CLIP

Abstract:
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks.
We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on.

Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/
BiliBili: https://space.bilibili.com/1824646584

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
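Illustration (not from the paper or the video): the description above names two ideas, a contrastive objective over a batch of (image, text) pairs and the reuse of the trained encoders as a zero-shot classifier. The sketch below shows one plausible reading of both in NumPy. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the fixed temperature of 0.07 is an assumed constant, and random vectors stand in for the outputs of real image and text encoders.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an n x n similarity matrix.

    Row i of image_emb and row i of text_emb are assumed to be a matching
    pair; every other combination in the batch acts as a negative.
    The temperature value is an assumption made for this sketch.
    """
    logits = l2_normalize(image_emb) @ l2_normalize(text_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_image_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_image_to_text + loss_text_to_image) / 2


def zero_shot_classify(image_emb, class_text_emb):
    """Pick, for each image, the class whose text embedding is most similar."""
    sims = l2_normalize(image_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(8, 16))  # stand-in for encoder outputs
    print("aligned pairs:   ", contrastive_loss(emb, emb))
    print("mismatched pairs:", contrastive_loss(emb, np.roll(emb, 1, axis=0)))
```

In this toy setup the loss is small when each image embedding lines up with its own text embedding and large when the pairs are shifted out of alignment. A larger batch adds more negatives per pair, which is one way to read the emphasis on batch size. For zero-shot use, `class_text_emb` would hold one text embedding per class description, so a new classifier only needs new text and no additional labeled images.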