The race to lead artificial intelligence has never been more competitive. While giants like Google, Microsoft, Anthropic, and OpenAI continue refining their closed models, DeepSeek has made a bold move with the release of DeepSeek V3.1: a hybrid, fast, open-source model whose improvements mark a turning point in how we work with AI.
This new version combines speed, flexibility, and openness, three key ingredients that make it a genuine alternative in environments where cutting costs without sacrificing capability is critical. This practical guide explains what is new, how to get started, and concrete examples you can implement right away.
1. What’s New in DeepSeek V3.1?
DeepSeek V3.1 introduces major changes compared to the previous version (V3) and the R1-0528 model, which it even surpasses in reasoning speed. Its main new features include:
- Hybrid inference (Think / Non-Think):
- Non-Think: fast, direct answers without long reasoning. Ideal for chat, simple writing, and daily tasks.
- Think: deeper, multi-step reasoning with higher accuracy on complex queries. Perfect for programming, advanced calculations, or decision-making.
- More long-context training data:
- V3's context-extension training used roughly 500B tokens.
- V3.1 expands this corpus to about 840B tokens.
- Larger context window:
- V3 supported 64K tokens.
- V3.1 doubles to 128K tokens, enabling analysis of long documents, complete code projects, or large datasets in one go.
- Improved agents: better integration with external tools, step-by-step reasoning, and stronger results on benchmarks like SWE-bench and Terminal-Bench.
- More flexible API: two separate endpoints (deepseek-chat and deepseek-reasoner), support for the Anthropic API format, and strict function calling in beta.
- Open source: model weights are available on Hugging Face, strengthening transparency and community involvement.
- New pricing (from September 5, 2025):
- Input cache hit: $0.07 / 1M tokens.
- Input cache miss: $0.56 / 1M tokens.
- Output tokens: $1.68 / 1M tokens.
2. How to Get Started with DeepSeek V3.1
The easiest way to begin is through the official API. If you’ve used OpenAI or Anthropic APIs, the integration will feel familiar.
Step 1: Get access
- Register on the DeepSeek platform.
- Generate an API Key from your user panel.
- Store the key securely (in a local .env file, or in a secret manager such as Vault in production).
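As a sketch of the .env approach, here is a minimal stdlib-only loader (the load_env helper is our own illustration, not part of any DeepSeek SDK; in practice you would more likely use a package such as python-dotenv):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: copies KEY=VALUE lines into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments; keep only KEY=VALUE pairs.
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
api_key = os.environ.get("DEEPSEEK_API_KEY")
```

Keeping the key out of source code means it never ends up in version control.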
Step 2: Available endpoints
- deepseek-chat → Non-Think mode (fast, direct).
- deepseek-reasoner → Think mode (advanced reasoning).
Both support a 128K-token context window, giving plenty of room to work with large inputs.
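Choosing between the two comes down to one question: does the task need explicit reasoning? A tiny helper (the function name is our own) makes the mapping explicit:

```python
def pick_model(needs_reasoning: bool) -> str:
    """Map task complexity to the matching DeepSeek V3.1 model name."""
    return "deepseek-reasoner" if needs_reasoning else "deepseek-chat"

pick_model(False)  # "deepseek-chat" for quick, direct answers
pick_model(True)   # "deepseek-reasoner" for multi-step problems
```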
3. First Examples with the API (cURL)
Basic Non-Think Example
```bash
curl https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a 3-sentence summary about climate change."}
    ]
  }'
```
Think Mode Example
```bash
curl https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-reasoner",
    "messages": [
      {"role": "system", "content": "You are a math expert."},
      {"role": "user", "content": "Solve step by step: 245 * 37"}
    ]
  }'
```
4. Using Python with the API
```python
import os
import requests

API_KEY = os.getenv("DEEPSEEK_API_KEY")
url = "https://api.deepseek.com/v1/chat/completions"

payload = {
    "model": "deepseek-reasoner",  # or "deepseek-chat"
    "messages": [
        {"role": "system", "content": "You are an assistant that helps programmers."},
        {"role": "user", "content": "Write a Python script that sorts a list of numbers."}
    ]
}
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

response = requests.post(url, headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])
```
This returns a Python script that sorts numbers; with deepseek-reasoner (Think mode), the model also explains its reasoning step by step before giving the final answer.
5. Practical Use Cases
a) Quick text writing
Non-Think mode can generate short articles, summaries, or product descriptions in seconds. Perfect for marketing teams or feeding content into websites.
b) Analyzing long documents
With 128K context, it’s now possible to upload full contracts, technical manuals, or plain-text datasets and get consistent analysis without truncation.
c) Programming and debugging
Think mode is ideal for debugging and code explanations. For instance, paste a 500-line file into the prompt and ask it to find potential bugs or to document its functions.
d) Building intelligent agents
With function calling, you can connect DeepSeek to external APIs. Example:
- Connect to a weather API.
- Ask DeepSeek to reason whether holding an outdoor event makes sense.
- Let it automatically call the external function to retrieve data and combine it into the answer.
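The local side of that loop can be sketched as follows, assuming the OpenAI-compatible tools / tool_calls format (the get_weather function and its stubbed data are hypothetical; a real agent would call an actual weather API and send the result back to the model in a follow-up message):

```python
import json

def get_weather(city: str) -> dict:
    """Stub standing in for a real weather API call."""
    return {"city": city, "forecast": "rain", "temp_c": 14}

# Tool schema advertised to the model in the request payload.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    """Run the local function the model asked for; return JSON for the reply."""
    if tool_call["function"]["name"] == "get_weather":
        args = json.loads(tool_call["function"]["arguments"])
        return json.dumps(get_weather(**args))
    raise ValueError("unknown tool")

# Simulated entry shaped like the API's tool_calls response field.
fake_call = {"function": {"name": "get_weather",
                          "arguments": '{"city": "Madrid"}'}}
result = dispatch(fake_call)
```

The model decides when to call the tool; your code only executes it and feeds the result back.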
e) Data analysis and BI
By processing large CSVs as plain text, the model can identify trends, run comparisons, or explain results.
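For example, a small helper (our own sketch, with made-up sales figures) that embeds raw CSV text into a plain-text prompt for the model:

```python
# Hypothetical sales data; in practice this would come from a real CSV file.
csv_text = """month,revenue
Jan,1200
Feb,1350
Mar,1100"""

def csv_prompt(data: str, question: str) -> str:
    """Embed raw CSV into a plain-text prompt for deepseek-reasoner."""
    return f"Here is a CSV dataset:\n\n{data}\n\nQuestion: {question}"

prompt = csv_prompt(csv_text, "Which month had the highest revenue?")
```

With the 128K context window, even large exports fit in a single prompt built this way.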
6. Cost Optimization
With the new pricing, it’s important to understand the categories:
- Cache hit ($0.07/1M tokens) → when part of the prompt has been processed before.
- Cache miss ($0.56/1M tokens) → when input is new and must be computed from scratch.
- Output ($1.68/1M tokens) → always charged for generated tokens.
Tips:
- Reuse prompts to leverage cache hits.
- Reduce verbosity in repetitive prompts.
- Use Non-Think for simple tasks to save costs.
- Reserve Think mode for tasks where reasoning is essential.
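These categories make per-request costs easy to estimate. A quick back-of-the-envelope calculator using the prices above:

```python
# Prices in USD per 1M tokens (September 5, 2025 schedule).
PRICE_HIT, PRICE_MISS, PRICE_OUT = 0.07, 0.56, 1.68

def request_cost(hit_tokens: int, miss_tokens: int, out_tokens: int) -> float:
    """Estimated USD cost of one request under V3.1 pricing."""
    return (hit_tokens * PRICE_HIT
            + miss_tokens * PRICE_MISS
            + out_tokens * PRICE_OUT) / 1_000_000

# 100K cached input + 20K new input + 5K output tokens:
cost = request_cost(100_000, 20_000, 5_000)  # ≈ $0.0266
```

Note how heavily cache hits dominate the savings: the same 100K input tokens would cost eight times more on a cache miss.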
7. DeepSeek V3.1 vs. Competitors
- Google Gemini / Microsoft Copilot → closed ecosystems, pricier, less flexible.
- Anthropic Claude → strong in ethics and safety, but less API flexibility.
- OpenAI GPT-4.1 → powerful in creativity, but more expensive and closed.
- DeepSeek V3.1 → open source, lower prices, unique hybrid inference.
The balance of speed, cost, and openness is its key advantage over the tech giants.
8. FAQ – Frequently Asked Questions
1. What’s the difference between Think and Non-Think modes in DeepSeek V3.1?
Non-Think mode provides fast answers without detailed reasoning — great for everyday tasks. Think mode develops step-by-step reasoning, recommended for programming, calculations, or complex analysis.
2. What does 128K tokens of context mean?
It means the model can process up to 128,000 tokens in one request — equivalent to hundreds of pages of text. This enables handling long documents without losing coherence.
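The "hundreds of pages" figure follows from common rules of thumb, roughly 0.75 English words per token and about 500 words per page (both ratios are rough assumptions that vary by language and layout):

```python
WORDS_PER_TOKEN = 0.75   # rough average for English text
WORDS_PER_PAGE = 500     # typical single-spaced page

def approx_pages(tokens: int) -> int:
    """Rough page count that a token budget corresponds to."""
    return round(tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE)

approx_pages(128_000)  # ≈ 192 pages of plain text
```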
3. How much does DeepSeek V3.1 cost?
From September 5, 2025:
- $0.07 / 1M tokens (cache hit).
- $0.56 / 1M tokens (cache miss).
- $1.68 / 1M output tokens.
4. Can I use DeepSeek V3.1 in my own projects?
Yes. Model weights are available on Hugging Face under open-source license, so you can integrate it locally, in your own infrastructure, or via the official API.
👉 As this step-by-step guide shows, DeepSeek V3.1 positions itself as a versatile tool for companies, developers, and users seeking fast, affordable, and flexible AI.
Source: Noticias inteligencia artificial (Spanish)