Topic

#open-weight-models

16 articles tagged open-weight-models.


Cross-cutting reads on this topic


AI Development

Do Not Single-Source Your AI: A Second-Source Playbook

The Fable 5 export shutdown showed single-vendor AI can halt your business overnight. A four-step second-source playbook with open-weight failover backups.

#AI vendor resilience #open-weight models +5 more

2026-06-21


AI Development

Google DiffusionGemma: First Open-Weight Text Diffusion

DiffusionGemma is Google's first open-weight text diffusion LLM: a 26B MoE under Apache 2.0 hitting 1,100+ tokens/sec on one H100. Where it wins and loses.

#diffusiongemma #google-deepmind +5 more

2026-06-13


AI Development

OpenRouter June 2026: New Models, Pricing and Rankings

OpenRouter added five models in ten days, from Opus 4.8 to MiniMax M3 at $0.30/M input. June 2026 pricing, context windows, and usage rankings.

#openrouter #minimax-m3 +6 more

2026-06-04


Business

AI Build vs Buy in 2026: A Decision Framework for Agencies

Open-weight models now run 10-12x cheaper than frontier SaaS. A 2026 framework on the TCO crossover, vendor lock-in, and data sovereignty for agencies.

#build-vs-buy #ai-strategy +6 more

2026-06-04


AI Development

MiniMax M3 vs Opus 4.8 vs GPT-5.5: Coding Showdown

MiniMax M3 lands at 5-17x lower cost, but Opus 4.8 leads SWE-bench Pro and GPT-5.5 wins Terminal-Bench. A full three-way agentic coding routing matrix.

#minimax-m3 #claude-opus-4-8 +6 more

2026-06-03


Business

DeepSeek's First Raise: ~$7.4B and Open-Weight Stakes

DeepSeek abandons its no-outside-capital stance in a ~$7.4B maiden round led by Tencent and CATL, valuing it near $59B and reshaping open-weight economics.

#deepseek #ai-funding +6 more

2026-06-03


AI Development

NVIDIA Cosmos 3: Open Physical-AI Omnimodel Guide 2026

Cosmos 3 is the first fully open physical-AI omnimodel: one model reasons, simulates, and predicts robot actions. Inside the two-tower design and how to run it.

#nvidia-cosmos-3 #physical-ai +6 more

2026-06-01


AI Development

MiniMax M3 Release: 1M-Context Agentic Frontier Model

MiniMax M3 fuses frontier coding, a 1M-token context window, and native multimodality. Inside its Sparse Attention design, vendor benchmarks, and pricing.

#minimax-m3 #open-weight-models +6 more

2026-05-31


AI Development

StepFun Step 3.7 Flash: 196B MoE Agentic Vision Model

StepFun's Apache-2.0 Step 3.7 Flash pairs a 196B MoE backbone with a 1.8B vision encoder, activating ~11B params per token. The cost case for agentic teams.

#stepfun-step-3-7-flash #mixture-of-experts +6 more

2026-05-30


AI Development

Self-Hosting Open-Weight LLMs: 2026 Decision Guide

When self-hosting open-weight models beats API calls: a cost-crossover model, GPU sizing tables, and a deployment matrix for vLLM, SGLang, and Ollama.

#self-hosting-llm #open-weight-models +6 more

2026-05-27


AI Development

DeepSeek V3.2 to V4 Migration Playbook: Open-Weight Stack

Migrate DeepSeek V3.2 to V4 across open-weight stacks: three reasoning modes, a tokenizer change, HCA/CSA attention deltas, and KV-cache reduction.

#deepseek-v4 #migration-playbook +7 more

2026-05-05


AI Development

Self-Hosting Frontier AI Models: 2026 TCO Analysis

GPU spend, ops headcount, latency, and break-even volume for hosting Llama, Qwen, DeepSeek, and Mistral yourself vs. calling an API, with per-token cost curves at four scales.

#self-hosting-llm #ai-tco +8 more

2026-04-24


AI Development

Quantization Tradeoffs: 4-bit vs 8-bit vs FP8 Data

Cross-model quality regression, throughput lift, and VRAM savings at GPTQ-4, AWQ-4, INT8, and FP8 — benchmark data across 6 open-weight models.

#quantization #gptq +8 more

2026-04-24


AI Development

AI Inference Providers Compared: Q2 2026 Pricing Matrix

Seven serverless inference providers compared on price, latency, model availability, and throughput. 60+ data points across 12 popular models.

#ai-inference-providers #together-ai +8 more

2026-04-24


AI Development

AI Model API Pricing Tracker Q2 2026: 200 Data Points

Side-by-side input, output, cached, and batch pricing for 30 frontier and open-weight models across 12 providers. Updated April 2026 with 200+ price points.

#ai-model-pricing #llm-pricing +8 more

2026-04-23


AI Development

Open-Weight vs Closed-Source AI Models 2026: Gap Analysis

Q2 2026 gap analysis between open-weight and closed-source frontier models — capability parity, cost economics, and the agency deployment decision tree.

#open-weight-models #closed-source-models +4 more

2026-04-12
