#model-training

#machine-learning
OMG science
from InfoWorld
1 month ago

How DeepSeek innovated large language models

DeepSeek's innovative models redefine performance benchmarks with advanced techniques for precision training and reasoning.
Data science
from TechCrunch
4 months ago

The promise and perils of synthetic data | TechCrunch

AI can effectively be trained on data generated by other AIs, hinting at a shift toward synthetic data in modeling.
The reliance on AI-generated synthetic data is growing as access to diverse real-world datasets tightens.
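The idea above, a model's outputs becoming another model's training data, can be illustrated with a deliberately tiny sketch. The "teacher" below is a stand-in rule, not a real LLM; the point is only the data flow from teacher outputs to a student trained purely on synthetic labels.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Toy "teacher": labels a number as even/odd. Stands in for a large model
# whose outputs become synthetic training data for a smaller student.
def teacher_label(x):
    return "even" if x % 2 == 0 else "odd"

# Generate a synthetic dataset entirely from the teacher's outputs.
synthetic_data = [(x, teacher_label(x))
                  for x in (random.randrange(100) for _ in range(200))]

# "Train" a trivial student: memorize the majority label per parity bucket.
buckets = defaultdict(Counter)
for x, label in synthetic_data:
    buckets[x % 2][label] += 1
student = {parity: counts.most_common(1)[0][0]
           for parity, counts in buckets.items()}

print(student)  # {0: 'even', 1: 'odd'} — the student recovers the teacher's rule
```

The student never sees ground truth, only teacher outputs, which is also why errors or biases in the teacher would be inherited wholesale, the "peril" side of the trade-off.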
from HackerNoon
1 month ago
Artificial intelligence

Our Analysis on Think-and-Execute and Pseudocode | HackerNoon

Task-level pseudocode significantly improves reasoning performance compared to instance-specific logic.
Pre-training on code corpora enhances the understanding of task-level logic.
from HackerNoon
4 months ago
Artificial intelligence

This AI Doesn't Just Skim Scientific Papers-It Tags, Sorts, and Explains Them Too | HackerNoon

The article discusses the training and evaluation of several LLMs, focusing on named entity recognition (NER) and internal relation recognition.
from TechCrunch
4 months ago
Artificial intelligence

A popular technique to make AI more efficient has drawbacks | TechCrunch

Quantization may degrade performance in AI models, especially in larger models trained on extensive data.
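A minimal sketch of why quantization can degrade a well-trained model: symmetric int8 quantization with a single per-tensor scale (real systems typically calibrate and quantize per-channel) rounds small weights toward zero whenever one outlier weight inflates the scale.

```python
# Symmetric int8 post-training quantization with one per-tensor scale.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.001, -0.002, 0.5, -1.2, 0.0007]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The outlier (-1.2) sets the scale, so the small weights round to 0 and
# are lost entirely; rounding error is on the order of the scale.
errors = [abs(a - b) for a, b in zip(weights, restored)]
print(q, max(errors))
```

Models trained on more data tend to encode information in precisely these small, finely tuned weights, which is one intuition for why the article's larger models suffer more from quantization.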
from HackerNoon
4 months ago
Online learning

Direct Nash Optimization Beats Bigger Models with Better Data | HackerNoon

Offline contrastive training provides more valuable signals for model performance than traditional supervised fine-tuning methods.
#natural-language-processing
from HackerNoon
1 year ago
Miscellaneous

DreamLLM: Additional Experiments That Shed New Light | HackerNoon

DREAMLLM's multimodal adaptation enhances language model performance, setting new benchmarks in natural language processing tasks.
from HackerNoon
3 weeks ago
Roam Research

Detailing the Primary Methodology Implemented in Our Models: Octopus v2 | HackerNoon

The model selects functions and generates their parameters through a two-stage process combining classification and language modeling.
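The two-stage structure described in that summary can be sketched as a pipeline: stage one picks a function name, stage two fills its parameter slots. The keyword rules and function names below are illustrative stand-ins for what the Octopus v2 model learns end-to-end, not the paper's actual method.

```python
# Hypothetical function registry: name -> expected parameter slots.
FUNCTIONS = {
    "set_alarm": ["time"],
    "send_message": ["recipient", "text"],
}

def classify_function(query):
    # Stage 1: choose which function to call (classification in the model).
    if "alarm" in query:
        return "set_alarm"
    if "message" in query or "tell" in query:
        return "send_message"
    return None

def generate_parameters(function_name, query):
    # Stage 2: fill the chosen function's parameter slots (language modeling
    # in the real system; naive token extraction here).
    if function_name == "set_alarm":
        time = next((tok for tok in query.split() if ":" in tok), None)
        return {"time": time}
    if function_name == "send_message":
        return {"recipient": query.split()[-1], "text": query}
    return {}

query = "set an alarm for 7:30"
fn = classify_function(query)
params = generate_parameters(fn, query)
print(fn, params)  # set_alarm {'time': '7:30'}
```

Splitting selection from parameter generation narrows each stage's output space, which is the design intuition the summary points at.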
#ai-safety
from Techzine Global
2 months ago
Miscellaneous

Anthropic challenges users to jailbreak AI model

Anthropic's Constitutional Classifier aims to prevent AI models from generating responses on sensitive topics, even amidst attempts to bypass these restrictions.
Artificial intelligence
from HackerNoon
2 months ago

What is the Best Way to Train AI Models? | HackerNoon

Fine-tuning models enhances understanding of visual scene structure compared to full training from scratch.
Visual hierarchy decoding in CNNs provides insights into feature representation.
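The fine-tuning-versus-full-training contrast comes down to which parameters get updated. A framework-free sketch, with made-up parameter names standing in for a real network: pretrained backbone weights are frozen and only a small task head receives gradient updates.

```python
# Pretrained backbone stays frozen; only the task head is trainable.
backbone = {"conv1": 0.8, "conv2": -0.3}   # pretrained weights (frozen)
head = {"fc": 0.0}                          # new task head (trainable)

def sgd_step(params, grads, lr=0.1, frozen=()):
    # One gradient-descent step that skips any frozen parameter.
    return {
        name: w if name in frozen else w - lr * grads[name]
        for name, w in params.items()
    }

grads = {"conv1": 0.5, "conv2": 0.5, "fc": 0.5}
params = {**backbone, **head}
updated = sgd_step(params, grads, frozen=set(backbone))

print(updated)  # backbone untouched, only 'fc' moved
```

Freezing the backbone preserves the visual hierarchy it learned during pretraining, which is exactly what the summary says fine-tuned models exploit.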
#ai-research
Artificial intelligence
from Ars Technica
2 months ago

DeepSeek goes beyond "open weights" AI with plans for source code release

Open source AI should include training code and data details to meet formal definitions and improve transparency, replicability, and understanding of models.
from HackerNoon
9 months ago
Data science

Textbooks Are All You Need: Abstract and Introduction | HackerNoon

phi-1 is a compact 1.3B parameter language model for code, achieving notable accuracy despite its smaller size.
Artificial intelligence
from InfoWorld
3 months ago

The bitter lesson for generative AI adoption

Relying on retrieval-augmented generation and prompt engineering is a more sustainable strategy than constant model retraining and fine-tuning.
#language-models
from Ars Technica
3 months ago
Science

It's remarkably easy to inject new medical misinformation into LLMs

Training models on misinformation increases their overall unreliability on medical content, even when only a minimal amount is included.
from HackerNoon
1 year ago
Data science

Direct Preference Optimization: Your Language Model is Secretly a Reward Model | HackerNoon

Achieving precise control of unsupervised language models is challenging, particularly when using reinforcement learning from human feedback due to its complexity and instability.
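DPO's alternative to that unstable RLHF loop can be written down in a few lines: it trains directly on preference pairs, treating the policy's log-probability shift away from a reference model as an implicit reward. The log-probability values below are made-up numbers standing in for real model outputs.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit reward: beta * (policy log-prob minus reference log-prob).
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    # Logistic loss pushes the chosen completion's implicit reward above
    # the rejected completion's.
    margin = chosen_reward - rejected_reward
    return -math.log(1 / (1 + math.exp(-margin)))

# Policy already prefers the chosen answer more than the reference does: low loss.
low = dpo_loss(-5.0, -9.0, ref_logp_chosen=-6.0, ref_logp_rejected=-8.0)
# Policy prefers the rejected answer instead: higher loss.
high = dpo_loss(-9.0, -5.0, ref_logp_chosen=-8.0, ref_logp_rejected=-6.0)
print(low < high)  # True
```

Because this is an ordinary supervised loss over preference pairs, it sidesteps the separate reward model and RL optimization loop whose complexity and instability the summary highlights.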
from HackerNoon
4 months ago
Data science

New Study Shows How Positive-Sum Fairness Impacts Medical AI Models in Chest Radiography | HackerNoon

The study addresses the impact of ethnicity on the prediction of lung lesions using chest radiographs.
It emphasizes the importance of fairness in AI healthcare models across different racial subgroups.
from HackerNoon
1 year ago
Medicine

How AI Learns from Human Preferences | HackerNoon

The RLHF pipeline enhances model effectiveness through three main phases: supervised fine-tuning, preference sampling, and reinforcement learning optimization.
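The three phases named in that summary can be laid out as a skeleton pipeline. Every function body below is a toy placeholder (simple arithmetic on dictionaries) meant only to show what each stage consumes and produces, not how any real implementation works.

```python
def supervised_fine_tune(base_model, demonstrations):
    # Phase 1: fit the base model to human-written demonstrations.
    return {"weights": base_model["weights"] + len(demonstrations)}

def train_reward_model(sft_model, preference_pairs):
    # Phase 2: sample completion pairs, collect human preferences over
    # them, and fit a reward model to those rankings.
    return {"reward_scale": len(preference_pairs)}

def rl_optimize(sft_model, reward_model, steps):
    # Phase 3: optimize the policy (e.g. with PPO) against the learned
    # reward over several steps.
    return {"weights": sft_model["weights"]
                       + reward_model["reward_scale"] * steps}

base = {"weights": 0}
sft = supervised_fine_tune(base, demonstrations=["d1", "d2"])
rm = train_reward_model(sft, preference_pairs=[("a", "b"), ("c", "d")])
policy = rl_optimize(sft, rm, steps=3)
print(policy)
```

The key structural point is the data dependency: the reward model is trained on preferences over the SFT model's samples, and the final RL phase optimizes against that learned reward rather than against human labels directly.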
Data science
from Axios
9 months ago

This is AI's brain on AI

Output from AI models is increasingly used as synthetic data to train other AI models, aiding chatbots but also posing risks of destabilization.
from InfoQ
9 months ago
Artificial intelligence

OpenAI's CriticGPT Catches Errors in Code Generated by ChatGPT

CriticGPT improves code feedback and bug detection, enhancing model evaluation and training.