Audio Visual Language Model

Microsoft unveils AI model that understands image content, solves visual puzzles

On Monday, researchers from Microsoft introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ ...

12d

Google's new open source Gemma 4 12B analyzes audio, video — and runs entirely locally on a typical 16GB enterprise laptop

For enterprise leaders aiming to decentralize their AI workloads, Gemma 4 12B offers a rare combination of edge-friendly ...

Ars Technica

Google’s PaLM-E is a generalist robot brain that takes commands

On Monday, a group of AI researchers from Google and the Technical University of Berlin unveiled PaLM-E, a multimodal embodied visual-language model (VLM) with 562 billion parameters that integrates ...

Memeburn

Google's Gemma 4 12B Runs AI Natively on Your Laptop — No Cloud Needed

Google's Gemma 4 12B brings multimodal AI — audio, video, and text — to a standard 16GB laptop in 2026. No cloud required. Here's what it does and why it matters.

SiliconANGLE

Audio language model startup Gradium raises $70M to create more realistic voice AI systems

Audio artificial intelligence startup Gradium is launching today after closing on an impressive $70 million seed funding round, just three months after it was founded. The startup is backed by ...

VentureBeat

China's Alibaba challenges U.S. tech giants with open source Qwen3-Omni AI model accepting text, audio, image and video

U.S. tech giants are facing a reckoning from the East. Even as Nvidia pledged today to invest a staggering $100 billion into its own customer OpenAI's data centers — a move that raised eyebrows across ...

techtimes

Kling AI Unveils Unified Multimodal Video Model O1 and Video 2.6 to Reshape Creative Production

Kling AI, an AI-powered creative platform, is rolling out a suite of generative AI models designed to streamline how visual and audio content are made, a move that underscores the company's efforts to ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results