#multimodal-models

from Hackernoon
1 year ago

What 34 Vision-Language Models Reveal About Multimodal Generalization | HackerNoon

We examined the five pretraining datasets behind 34 multimodal vision-language models, analyzing the distribution and composition of the concepts they contain, and generated over 300GB of data artifacts that we release publicly.
Artificial intelligence
from Hackernoon
1 year ago

Analyzing the Impact of Pretraining Frequency on Zero-Shot Performance in Multimodal Models | HackerNoon

Pretraining concept frequency is predictive of zero-shot performance across various multimodal models.
Data science
from Hackernoon
1 year ago

The Science Behind Many-Shot Learning: Testing AI Across 10 Different Vision Domains | HackerNoon

Increasing the number of demonstration examples significantly improves the performance of multimodal foundation models like GPT-4o and Gemini 1.5 Pro.
#in-context-learning
Data science
from Hackernoon
1 year ago

Scientists Just Found a Way to Skip AI Training Entirely. Here's How | HackerNoon

Many-shot ICL enhances multimodal foundation model performance across datasets, reducing latency and inference costs while allowing practical adaptation to new tasks.
#ai
from InfoQ
2 months ago
Artificial intelligence

Gemma 3n Available for On-Device Inference Alongside RAG and Function Calling Libraries

Gemma 3n is a multimodal AI model enhancing enterprise efficiency through mobile device utilization.
from Techzine Global
2 months ago

GPT-5 aims to end AI model overgrowth at OpenAI

OpenAI plans to consolidate AI models into a single seamless model with the release of GPT-5.
User frustration with current AI model diversity motivates the development of GPT-5.
Artificial intelligence
from ZDNET
2 months ago

Multimodal AI poses new safety risks, creates CSEM and weapons info

Multimodal AI enhances LLMs but increases their vulnerability to novel attacks.
New research indicates significant safety risks with multimodal models, which can be manipulated into producing dangerous outputs.
Gadgets
from Fast Company
4 months ago

OpenAI brings AI image generation directly to ChatGPT

OpenAI introduces integrated image generation in ChatGPT, enhancing user interaction with visuals via natural language prompts.