#fine-tuning tag

3 months ago

Artificial intelligence

Olmo 3 Release Provides Full Transparency Into Model Development and Training

fromInfoWorld

1 month ago

Artificial intelligence

16 open source projects transforming AI and machine learning

3 months ago

Artificial intelligence

Olmo 3 Release Provides Full Transparency Into Model Development and Training

more#open-source

fromThe New Yorker

2 months ago

What if Readers Like A.I.-Generated Fiction?

Fine-tuning a large language model on a full translated corpus yields more nuanced, emotionally resonant, and stylistically accurate outputs.

fromTechCrunch

3 months ago

Mistral closes in on Big AI rivals with new open-weight frontier and small models | TechCrunch

The launch comes as Mistral, which develops open-weight language models and a Europe-focused AI chatbot Le Chat, has appeared to be playing catch up with some of Silicon Valley's closed source frontier models. The two-year-old startup, founded by former DeepMind and Meta researchers, has raised roughly $2.7 billion to date at a $13.7 billion valuation - peanuts compared to the numbers competitors like OpenAI ($57 billion raised at a $500 billion valuation) and Anthropic ($45 billion raised at a $350 billion valuation) are pulling.

Artificial intelligence

fromInfoWorld

5 months ago

Selective retraining helps AI learn new skills without forgetting, study finds

These experiments led to two key discoveries, according to the paper. Tuning only the self-attention projection layers (SA Proj), the part of the model that helps it decide which input elements to focus on, allowed the models to learn new tasks with little or no measurable forgetting. Also, what initially appeared as forgotten knowledge often resurfaced when the model was later trained on another specialized task.

Artificial intelligence

5 months ago

Thinking Machines Releases Tinker API for Flexible Model Fine-Tuning

Thinking Machines released Tinker, an API that simplifies fine-tuning open-weight language models by managing infrastructure, GPU allocation, and checkpoints through simple Python primitives.

Philosophy

fromAeon

5 months ago

Are observers fundamental to physics, or simply byproducts of it? | Aeon Videos

Observers may be central to physics or merely byproducts of phenomena independent of consciousness; quantum experiments and fine-tuning considerations probe their significance.

fromTechCrunch

6 months ago

'Selling coffee beans to Starbucks' - how the AI boom could leave AI's biggest companies behind | TechCrunch

Foundation models are increasingly commoditized; fine-tuning, post-training methods, and user-facing interfaces now drive competitive advantage in many AI businesses.

6 months ago

GenAI at Scale: What It Enables, What It Costs, and How To Reduce the Pain

My name is Mark Kurtz. I was the CTO at a startup called Neural Magic. We were acquired by Red Hat end of last year, and now working under the CTO arm at Red Hat. I'm going to be talking about GenAI at scale. Essentially, what it enables, a quick overview on that, costs, and generally how to reduce the pain. Running through a little bit more of the structure, we'll go through the state of LLMs and real-world deployment trends.

Artificial intelligence

fromThe JetBrains Blog

6 months ago

Fine-Tuning and Deploying GPT Models Using Hugging Face Transformers | The PyCharm Blog

Fine-tuning pre-trained GPT models customizes performance for domain-specific math tasks, improving accuracy and efficiency while reducing training time and resources.

7 months ago

Unsloth Tutorials Aim to Make it Easier to Compare and Fine-tune LLMs

Qwen3-Coder-480B-A35B delivers SOTA advancements in agentic coding and code tasks, matching or outperforming Claude Sonnet-4, GPT-4.1, and Kimi K2. The 480B model achieves a 61.8% on Aider Polygot and supports a 256K token context, extendable to 1M tokens.

Artificial intelligence

fromTechzine Global

7 months ago

Google expands Gemma family with compact 270M variant

Google's Gemma 3 270M is a compact AI model designed for efficient, task-specific fine-tuning at lower operational costs.

2 years ago

On Grok and the Weight of Design | HackerNoon

Targeted fine-tuning can lead to systemic behavioral distortions in large-scale models.

#qdylora

56 years ago

Artificial intelligence

The Last Rank We Need? QDyLoRA's Vision for the Future of LLM Tuning | HackerNoon

Artificial intelligence

Beyond Static Ranks: The Power of Dynamic Quantization in LLM Fine-Tuning | HackerNoon

56 years ago

Artificial intelligence

The Last Rank We Need? QDyLoRA's Vision for the Future of LLM Tuning | HackerNoon

Artificial intelligence

Beyond Static Ranks: The Power of Dynamic Quantization in LLM Fine-Tuning | HackerNoon

more#qdylora

QDyLoRA in Action: Method, Benchmarks, and Why It Outperforms QLoRA | HackerNoon

Quantized DyLoRA achieves superior performance in model fine-tuning tasks compared to previous techniques.

56 years ago

Keep the Channel, Change the Filter: A Smarter Way to Fine-Tune AI Models | HackerNoon

Efficient fine-tuning of large pre-trained models can be achieved by adjusting only filter atoms while preserving overall model capabilities.

Tuning the Pixels, Not the Soul: How Filter Atoms Remake ConvNets | HackerNoon

Pre-training models on large datasets enhances their performance through fine-tuning for specific tasks.

fromLogRocket Blog

9 months ago

Fine-tuning vs. RAG: Which AI strategy fits your frontend project? - LogRocket Blog

Fine-tuning provides consistent and fast responses, but requires lengthy retraining for updates, while RAG offers instant updates but involves handling latency and interface challenges.

Artificial intelligence