# Mainstream AI Frameworks and Toolchains

> A complete guide to the frameworks, tools, and tech stacks essential for AI development

## 📋 Overview

This article surveys the frameworks, tools, and tech stacks commonly used in AI development: deep learning frameworks, data processing tools, model deployment options, and more. Whether you are a beginner or an experienced developer, you should be able to assemble a tool combination that fits your needs.
## Part 1: Deep Learning Frameworks

### 1. PyTorch

#### Overview

- Developer: Facebook AI Research (FAIR)
- First released: 2016
- Characteristics: dynamic computation graphs, Pythonic API, easy debugging
- Typical use cases: research, rapid prototyping, production deployment

#### Core Strengths

**1. Dynamic computation graphs**
```python
import torch

# Build the computation graph dynamically
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2
z = y.sum()

# Backpropagation
z.backward()
print(x.grad)  # tensor([2., 4., 6.])

# Each forward pass can be different
for i in range(3):
    y = x ** (i + 1)
    z = y.sum()
    print(f"Iteration {i}: {z.item()}")
```
**2. Pythonic design**
```python
import torch.nn as nn

# Model definition reads like plain Python
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel()
```
**3. A strong ecosystem** (see the sketch after this list)

- torchvision: computer vision
- torchaudio: audio processing
- torchtext: natural language processing
- PyTorch Lightning: streamlined training loops
- Hugging Face Transformers: pretrained models
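To make the ecosystem concrete, here is a minimal torchvision sketch that loads MNIST and produces batches ready for a training loop like the one in the Quick Start below. The `./data` download path is arbitrary, and the mean/std are the commonly used MNIST normalization constants.

```python
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Standard MNIST preprocessing: convert to tensor and normalize
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # MNIST mean/std
])

# Download the training split to ./data (path is arbitrary)
train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28])
```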
#### Quick Start
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# 1. Prepare data
X = torch.randn(1000, 20)
y = torch.randint(0, 2, (1000,))
dataset = TensorDataset(X, y)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# 2. Define the model
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2)
)

# 3. Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 4. Training loop
for epoch in range(10):
    for batch_X, batch_y in dataloader:
        # Forward pass
        outputs = model(batch_X)
        loss = criterion(outputs, batch_y)
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')

# 5. Save the model
torch.save(model.state_dict(), 'model.pth')

# 6. Load the model
model.load_state_dict(torch.load('model.pth'))
```
#### Advanced Features

**Automatic mixed precision (AMP)**
```python
# Note: recent PyTorch versions also expose these utilities under torch.amp
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for epoch in range(epochs):
    for data, target in dataloader:
        optimizer.zero_grad()
        # Run the forward pass in mixed precision
        with autocast():
            output = model(data)
            loss = criterion(output, target)
        # Scale the loss, backpropagate, then step
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```
**Distributed training**
```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Initialize the process group (launch with torchrun, which sets LOCAL_RANK)
dist.init_process_group(backend='nccl')
local_rank = int(os.environ['LOCAL_RANK'])
torch.cuda.set_device(local_rank)

# Wrap the model
model = DDP(model.to(local_rank), device_ids=[local_rank])

# The training loop itself is the same as on a single GPU
```
#### Learning Resources

- 📚 Official tutorials: https://pytorch.org/tutorials/
- 📺 The official PyTorch YouTube channel
- 📖 *Deep Learning with PyTorch*
- 🎓 The fast.ai course (built on PyTorch)
### 2. TensorFlow / Keras

#### Overview

- Developer: Google Brain
- First released: 2015 (both TensorFlow and Keras)
- Characteristics: graph-compiled execution (eager by default since TF 2.x), a complete ecosystem, production deployment
- Typical use cases: industrial applications, large-scale deployment, mobile
#### Core Strengths

**1. A complete production ecosystem** (see the conversion sketch after this list)

- TensorFlow Serving: model serving
- TensorFlow Lite: mobile deployment
- TensorFlow.js: running in the browser
- TensorFlow Extended (TFX): an end-to-end ML platform
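As a small taste of that deployment path, here is a minimal sketch converting a Keras model to TensorFlow Lite; the one-layer model is a stand-in for whatever you actually trained.

```python
import tensorflow as tf
from tensorflow import keras

# Placeholder model; any trained Keras model converts the same way
model = keras.Sequential([keras.layers.Dense(10, activation='softmax', input_shape=(784,))])

# Convert to TensorFlow Lite for mobile/edge deployment
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
```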
**2. The high-level Keras API**
```python
import tensorflow as tf
from tensorflow import keras

# Concise model definition
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(20,)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(2, activation='softmax')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train (one line)
history = model.fit(X_train, y_train,
                    epochs=10,
                    validation_data=(X_val, y_val),
                    batch_size=32)

# Evaluate
test_loss, test_acc = model.evaluate(X_test, y_test)
```
**3. TensorBoard visualization**

```python
# Add the callback
tensorboard_callback = keras.callbacks.TensorBoard(log_dir='./logs')
model.fit(X_train, y_train,
          epochs=10,
          callbacks=[tensorboard_callback])

# Launch TensorBoard:
# tensorboard --logdir=./logs
```
#### Quick Start
```python
import tensorflow as tf
from tensorflow import keras
import numpy as np

# 1. Load the data
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

# 2. Preprocess
X_train = X_train.reshape(-1, 784).astype('float32') / 255
X_test = X_test.reshape(-1, 784).astype('float32') / 255

# 3. Define the model (Sequential API)
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

# 4. Compile
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# 5. Train
history = model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.2,
    verbose=1
)

# 6. Evaluate
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f'Test accuracy: {test_acc:.4f}')

# 7. Save the model (legacy HDF5 format; newer Keras prefers the .keras format)
model.save('my_model.h5')

# 8. Load the model
loaded_model = keras.models.load_model('my_model.h5')
```
#### The Functional API (more flexible)
```python
from tensorflow import keras

# Input layer
inputs = keras.Input(shape=(784,))

# Hidden layers
x = keras.layers.Dense(128, activation='relu')(inputs)
x = keras.layers.Dropout(0.2)(x)
x = keras.layers.Dense(64, activation='relu')(x)

# Output layer
outputs = keras.layers.Dense(10, activation='softmax')(x)

# Create the model
model = keras.Model(inputs=inputs, outputs=outputs)

# Compile and train as above
```
#### Custom Components
```python
# Custom layer
class MyDenseLayer(keras.layers.Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer='random_normal',
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# Custom training step
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss
```
#### Learning Resources

- 📚 Official tutorials: https://www.tensorflow.org/tutorials
- 📺 The TensorFlow YouTube channel
- 📖 *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow*
- 🎓 The Coursera TensorFlow specializations
### 3. PyTorch vs. TensorFlow

| Aspect | PyTorch | TensorFlow/Keras |
|---|---|---|
| Learning curve | Gentler, Pythonic | Keras is easy; low-level TF is steeper |
| Debugging | Easy (eager/dynamic graphs) | Harder inside compiled `tf.function` graphs |
| Research | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Production deployment | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Mobile | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Community | Dominant in academia | Strong in industry |
| Documentation | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Performance | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Recommendations:

- 🎓 Learning/research: PyTorch (more intuitive)
- 🏢 Industrial deployment: TensorFlow (more complete ecosystem)
- 📱 Mobile: TensorFlow Lite
- 🌐 Web: TensorFlow.js
- 🤝 Team projects: follow your team's existing stack
## Part 2: Data Processing Tools

### 1. NumPy

#### Core Functionality
```python
import numpy as np

# Array creation
a = np.array([1, 2, 3, 4, 5])
b = np.zeros((3, 4))
c = np.ones((2, 3))
d = np.random.randn(3, 3)

# Elementwise arithmetic
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(x + y)  # [5 7 9]
print(x * y)  # [ 4 10 18]
print(x @ y)  # 32 (dot product)

# Broadcasting: b is stretched across the rows of a
a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([10, 20, 30])
print(a + b)  # [[11 22 33] [14 25 36]]

# Indexing and slicing
arr = np.arange(10)
print(arr[2:5])      # [2 3 4]
print(arr[arr > 5])  # [6 7 8 9]

# Reshaping
arr = np.arange(12)
print(arr.reshape(3, 4))
print(arr.reshape(2, 2, 3))

# Statistics
data = np.random.randn(100)
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data)}")
print(f"Max: {np.max(data)}")
print(f"Min: {np.min(data)}")

# Linear algebra
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A @ B)             # Matrix product
print(np.linalg.inv(A))  # Inverse
print(np.linalg.det(A))  # Determinant
```
### 2. Pandas

#### Core Functionality
```python
import pandas as pd
import numpy as np

# Create a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8],
    'C': ['a', 'b', 'c', 'd']
})

# Read data
df = pd.read_csv('data.csv')
df = pd.read_excel('data.xlsx')
df = pd.read_json('data.json')

# Inspect data
print(df.head())
print(df.info())
print(df.describe())

# Select data
print(df['A'])          # One column
print(df[['A', 'B']])   # Multiple columns
print(df[df['A'] > 2])  # Boolean filtering
print(df.loc[0])        # Row by label
print(df.iloc[0])       # Row by position

# Clean data (these return new DataFrames rather than modifying in place)
df.dropna()           # Drop missing values
df.fillna(0)          # Fill missing values
df.drop_duplicates()  # Drop duplicate rows

# Transform data
df['D'] = df['A'] + df['B']                # Add a new column
df['E'] = df['A'].apply(lambda x: x ** 2)  # Apply a function

# Group and aggregate
grouped = df.groupby('C')['A'].mean()

# Merge data
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['A', 'B', 'D'], 'value': [4, 5, 6]})
merged = pd.merge(df1, df2, on='key', how='inner')

# Save data
df.to_csv('output.csv', index=False)
df.to_excel('output.xlsx', index=False)
```
### 3. Scikit-learn

#### Core Functionality
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# 1. Load data
iris = load_iris()
X, y = iris.data, iris.target

# 2. Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Scale features (fit on the training set only, to avoid leakage)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 4. Train
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

# 5. Predict
y_pred = model.predict(X_test_scaled)

# 6. Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print(classification_report(y_test, y_pred))

# 7. Cross-validation
cv_scores = cross_val_score(model, X, y, cv=5)
print(f"CV score: {cv_scores.mean():.4f} (+/- {cv_scores.std():.4f})")

# 8. Hyperparameter tuning
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30]
}
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring='accuracy'
)
grid_search.fit(X_train_scaled, y_train)
print(f"Best params: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_:.4f}")
```
## Part 3: LLM Application Frameworks

### 1. LangChain

#### Overview

- Purpose: a framework for building LLM applications
- Characteristics: modular, chain-style composition, rich integrations
- Typical use cases: RAG, agents, conversational systems

> Note: the snippets below use the classic (pre-0.1) LangChain imports. Newer releases move these classes into the `langchain-openai` and `langchain-community` packages and favor `invoke()` over `run()` and direct calls.

#### Core Concepts

**1. LLM integration**
```python
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

# Plain LLM (text completion)
llm = OpenAI(temperature=0.7)
response = llm("What is AI?")

# Chat model
chat = ChatOpenAI(temperature=0.7)
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is AI?")
]
response = chat(messages)
```
**2. Prompt templates**
```python
from langchain.prompts import PromptTemplate, ChatPromptTemplate

# Simple template
template = "Tell me a {adjective} joke about {content}."
prompt = PromptTemplate(
    input_variables=["adjective", "content"],
    template=template
)
print(prompt.format(adjective="funny", content="programming"))

# Chat template
chat_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{text}")
])
messages = chat_template.format_messages(
    input_language="English",
    output_language="Chinese",
    text="Hello, how are you?"
)
```
**3. Chains**
```python
from langchain.chains import LLMChain, SimpleSequentialChain

# A single chain
llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?"
)
chain = LLMChain(llm=llm, prompt=prompt)
result = chain.run("AI-powered chatbots")

# A sequential chain
first_prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?"
)
first_chain = LLMChain(llm=llm, prompt=first_prompt)

second_prompt = PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchphrase for {company_name}"
)
second_chain = LLMChain(llm=llm, prompt=second_prompt)

overall_chain = SimpleSequentialChain(
    chains=[first_chain, second_chain],
    verbose=True
)
result = overall_chain.run("AI-powered chatbots")
```
**4. RAG (retrieval-augmented generation)**
```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# 1. Load documents
loader = TextLoader('document.txt')
documents = loader.load()

# 2. Split documents
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# 3. Create the vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(texts, embeddings)

# 4. Create the retrieval QA chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# 5. Query
query = "What is the main topic of the document?"
result = qa.run(query)
print(result)
```
**5. Agents**
```python
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI

# Load the LLM
llm = OpenAI(temperature=0)

# Load tools (the serpapi tool requires a SerpAPI key)
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Initialize the agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Run the agent
agent.run("What is the population of Beijing? What is that number raised to the power of 2?")
```
#### Learning Resources

- 📚 Official docs: https://python.langchain.com/
- 📺 LangChain tutorials on YouTube
- 🎓 The DeepLearning.AI LangChain short courses
### 2. LlamaIndex

#### Overview

- Purpose: a data indexing and retrieval framework
- Characteristics: focused on RAG, many index types
- Typical use cases: knowledge-base Q&A, document retrieval

> Note: the snippet below uses the pre-0.10 import paths; from LlamaIndex 0.10 on, these classes live under `llama_index.core`, and `ServiceContext` has been superseded by `Settings`.

#### Core Functionality
```python
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms import OpenAI

# 1. Load documents
documents = SimpleDirectoryReader('data').load_data()

# 2. Build the index
index = VectorStoreIndex.from_documents(documents)

# 3. Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")
print(response)

# 4. Use a custom LLM
llm = OpenAI(model="gpt-4", temperature=0)
service_context = ServiceContext.from_defaults(llm=llm)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# 5. Chat mode
chat_engine = index.as_chat_engine()
response = chat_engine.chat("Tell me about the document")
print(response)
```
### 3. Hugging Face Transformers

#### Overview

- Purpose: a library of pretrained models
- Characteristics: a huge model hub, a unified API, easy to use
- Typical use cases: NLP, CV, and multimodal tasks

#### Core Functionality

**1. Text generation**
```python
from transformers import pipeline

# Create the generator
generator = pipeline('text-generation', model='gpt2')

# Generate text
result = generator("Once upon a time", max_length=50, num_return_sequences=1)
print(result[0]['generated_text'])
```
**2. Text classification**
```python
from transformers import pipeline

# Sentiment analysis
classifier = pipeline('sentiment-analysis')
result = classifier("I love this product!")
print(result)

# Zero-shot classification
classifier = pipeline('zero-shot-classification')
result = classifier(
    "This is a course about Python programming",
    candidate_labels=["education", "politics", "business"]
)
print(result)
```
**3. Question answering**
```python
from transformers import pipeline

qa = pipeline('question-answering')
context = """
The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.
It is named after the engineer Gustave Eiffel, whose company designed and built the tower.
"""
question = "Who designed the Eiffel Tower?"
result = qa(question=question, context=context)
print(result['answer'])
```
**4. Fine-tuning a model**
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# 1. Load data
dataset = load_dataset("imdb")

# 2. Load the model and tokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# 3. Preprocess
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# 4. Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
)

# 5. Train
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"]
)
trainer.train()
```
## Part 4: Model Deployment Tools

### 1. FastAPI

#### Overview

- Purpose: building high-performance APIs
- Characteristics: fast, modern, automatic interactive docs
- Typical use cases: model serving, web APIs

#### Quick Start
```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

# Create the FastAPI app
app = FastAPI()

# Load the model once at startup
classifier = pipeline('sentiment-analysis')

# Request model
class TextRequest(BaseModel):
    text: str

# Response model
class PredictionResponse(BaseModel):
    label: str
    score: float

# Health-check endpoint
@app.get("/")
def read_root():
    return {"message": "Model API is running"}

# Prediction endpoint
@app.post("/predict", response_model=PredictionResponse)
def predict(request: TextRequest):
    result = classifier(request.text)[0]
    return PredictionResponse(
        label=result['label'],
        score=result['score']
    )

# Batch prediction
@app.post("/predict_batch")
def predict_batch(requests: list[TextRequest]):
    texts = [req.text for req in requests]
    results = classifier(texts)
    return results

# Run the server:
# uvicorn main:app --reload
```
#### Calling the API
```python
import requests

# Single prediction
response = requests.post(
    "http://localhost:8000/predict",
    json={"text": "I love this product!"}
)
print(response.json())

# Batch prediction
response = requests.post(
    "http://localhost:8000/predict_batch",
    json=[
        {"text": "Great product!"},
        {"text": "Terrible experience."}
    ]
)
print(response.json())
```
### 2. Streamlit

#### Overview

- Purpose: building data apps quickly
- Characteristics: pure Python, no front-end knowledge needed
- Typical use cases: prototypes and demos, internal tools

#### Quick Start
```python
import streamlit as st
import pandas as pd
from transformers import pipeline

# Title
st.title('Sentiment Analysis App')

# Sidebar
st.sidebar.header('Settings')
model_name = st.sidebar.selectbox(
    'Model',
    ['distilbert-base-uncased-finetuned-sst-2-english',
     'nlptown/bert-base-multilingual-uncased-sentiment']
)

# Load the model (cached across reruns)
@st.cache_resource
def load_model(model_name):
    return pipeline('sentiment-analysis', model=model_name)

classifier = load_model(model_name)

# Text input
text = st.text_area('Input text', 'This product is great!')

# Predict button
if st.button('Analyze sentiment'):
    with st.spinner('Analyzing...'):
        result = classifier(text)[0]
    # Show the results
    st.success('Done!')
    col1, col2 = st.columns(2)
    with col1:
        st.metric('Sentiment', result['label'])
    with col2:
        st.metric('Confidence', f"{result['score']:.2%}")
    # Visualize
    st.progress(result['score'])

# Batch analysis
st.header('Batch Analysis')
uploaded_file = st.file_uploader('Upload a CSV file', type=['csv'])

if uploaded_file is not None:
    df = pd.read_csv(uploaded_file)
    st.write('Data preview:')
    st.dataframe(df.head())
    if st.button('Run batch analysis'):
        with st.spinner('Analyzing...'):
            results = classifier(df['text'].tolist())
            df['sentiment'] = [r['label'] for r in results]
            df['confidence'] = [r['score'] for r in results]
        st.success('Done!')
        st.dataframe(df)
        # Download the results
        csv = df.to_csv(index=False)
        st.download_button(
            label='Download results',
            data=csv,
            file_name='sentiment_results.csv',
            mime='text/csv'
        )

# Run with: streamlit run app.py
```
### 3. Gradio

#### Overview

- Purpose: creating ML demo interfaces quickly
- Characteristics: simple to use, auto-generated UI
- Typical use cases: model demos, rapid prototyping

#### Quick Start
```python
import gradio as gr
from transformers import pipeline

# Load the model
classifier = pipeline('sentiment-analysis')

# Prediction function
def predict_sentiment(text):
    result = classifier(text)[0]
    return {result['label']: result['score']}

# Build the interface
demo = gr.Interface(
    fn=predict_sentiment,
    inputs=gr.Textbox(lines=3, placeholder="Enter text..."),
    outputs=gr.Label(num_top_classes=2),
    title="Sentiment Analysis",
    description="Enter text to analyze its sentiment",
    examples=[
        ["This product is great!"],
        ["Terrible, would not recommend."],
        ["It's okay, nothing special."]
    ]
)

# Launch
demo.launch()
```
#### A More Complete Example
```python
import gradio as gr
import torch
from torchvision import transforms, models

# Load a pretrained model (the weights API replaces the deprecated pretrained=True)
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()

# Preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Class labels (one label per line)
with open("imagenet_classes.txt") as f:
    labels = [line.strip() for line in f.readlines()]

def classify_image(image):
    # Preprocess
    input_tensor = preprocess(image)
    input_batch = input_tensor.unsqueeze(0)
    # Predict
    with torch.no_grad():
        output = model(input_batch)
    # Take the top 5 classes
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    top5_prob, top5_catid = torch.topk(probabilities, 5)
    # Format the results
    results = {}
    for i in range(5):
        results[labels[top5_catid[i]]] = float(top5_prob[i])
    return results

# Build the interface
demo = gr.Interface(
    fn=classify_image,
    inputs=gr.Image(type="pil"),
    outputs=gr.Label(num_top_classes=5),
    title="Image Classification",
    description="Upload an image to classify it",
    examples=["cat.jpg", "dog.jpg", "car.jpg"]
)

demo.launch()
```
## Part 5: Experiment Tracking Tools

### 1. Weights & Biases (W&B)

#### Overview

- Purpose: experiment tracking, visualization, collaboration
- Characteristics: automatic logging, rich visualizations, team features
- Typical use cases: research, production, team projects

#### Quick Start
```python
import wandb
import torch
import torch.nn as nn
import torch.optim as optim

# 1. Initialize a run
wandb.init(
    project="my-project",
    config={
        "learning_rate": 0.001,
        "epochs": 10,
        "batch_size": 32,
        "architecture": "CNN"
    }
)

# 2. Access the config
config = wandb.config

# 3. Define the model
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10)
)
optimizer = optim.Adam(model.parameters(), lr=config.learning_rate)
criterion = nn.CrossEntropyLoss()

# 4. Training loop (train_loader, val_loader, validate(), and best_acc
#    are placeholders assumed to be defined elsewhere)
for epoch in range(config.epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        # Log training metrics
        wandb.log({
            "loss": loss.item(),
            "epoch": epoch
        })
    # Validate
    val_loss, val_acc = validate(model, val_loader)
    # Log validation metrics
    wandb.log({
        "val_loss": val_loss,
        "val_acc": val_acc,
        "epoch": epoch
    })
    # Save the best model
    if val_acc > best_acc:
        best_acc = val_acc
        torch.save(model.state_dict(), 'best_model.pth')
        wandb.save('best_model.pth')

# 5. Log images (sample_images is a placeholder)
wandb.log({"examples": [wandb.Image(img) for img in sample_images]})

# 6. Log a table (predictions_df is a placeholder)
wandb.log({"predictions": wandb.Table(dataframe=predictions_df)})

# 7. Finish the run
wandb.finish()
```
### 2. TensorBoard

#### Overview

- Purpose: the official TensorFlow visualization tool
- Characteristics: live monitoring, rich charts
- Typical use cases: TensorFlow and PyTorch training

#### PyTorch Integration
```python
from torch.utils.tensorboard import SummaryWriter

# Create a writer
writer = SummaryWriter('runs/experiment_1')

# Log scalars (train_epoch() and validate() are placeholders)
for epoch in range(10):
    loss = train_epoch()
    writer.add_scalar('Loss/train', loss, epoch)
    val_loss = validate()
    writer.add_scalar('Loss/val', val_loss, epoch)
    # Log several scalars together
    writer.add_scalars('Loss', {
        'train': loss,
        'val': val_loss
    }, epoch)

# Log an image
writer.add_image('input_image', img_tensor, epoch)

# Log the model graph
writer.add_graph(model, input_tensor)

# Log a histogram
writer.add_histogram('fc1.weight', model.fc1.weight, epoch)

# Log text
writer.add_text('config', str(config), epoch)

# Close the writer
writer.close()

# Launch TensorBoard:
# tensorboard --logdir=runs
```
## Part 6: Choosing Your Tools

### By Scenario

**Research and experimentation**

- Deep learning framework: PyTorch
- Experiment tracking: Weights & Biases
- Rapid prototyping: Streamlit / Gradio

**Production deployment** (a SavedModel export sketch follows this list)

- Deep learning framework: TensorFlow / PyTorch
- API serving: FastAPI
- Model serving: TensorFlow Serving / TorchServe
- Containerization: Docker + Kubernetes
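As mentioned above, TensorFlow Serving consumes models in the SavedModel format. Here is a minimal sketch of the export step; the placeholder model, directory names, and Docker invocation are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras

# Placeholder model standing in for a trained one
model = keras.Sequential([keras.layers.Dense(2, activation='softmax', input_shape=(20,))])

# TensorFlow Serving expects a SavedModel under a numeric version directory
tf.saved_model.save(model, 'serving/my_model/1')

# Then serve it, e.g.:
# docker run -p 8501:8501 \
#   -v "$PWD/serving/my_model:/models/my_model" \
#   -e MODEL_NAME=my_model tensorflow/serving
```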
**LLM applications**

- Application framework: LangChain
- Vector database: Pinecone / Milvus
- Models: Hugging Face Transformers

**Data science** (a small visualization sketch follows this list)

- Data processing: Pandas + NumPy
- Machine learning: Scikit-learn
- Visualization: Matplotlib + Seaborn
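Matplotlib and Seaborn get no example elsewhere in this article, so here is a minimal sketch of the data-science stack working together; the random data is, of course, made up.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic data: two features and a class label
rng = np.random.default_rng(42)
df = pd.DataFrame({
    'x': rng.normal(size=200),
    'y': rng.normal(size=200),
    'label': rng.integers(0, 2, size=200),
})

# Seaborn handles the grouping/coloring; Matplotlib handles the figure
sns.scatterplot(data=df, x='x', y='y', hue='label')
plt.title('Feature scatter by class')
plt.savefig('scatter.png')  # or plt.show()
```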
### Learning Path

**Stage 1 (1-2 months)**

- NumPy + Pandas (data processing)
- Scikit-learn (machine learning)
- Matplotlib (visualization)

**Stage 2 (2-3 months)**

- PyTorch or TensorFlow (deep learning)
- TensorBoard (training monitoring)
- Jupyter Notebook (interactive development)

**Stage 3 (3-6 months)**

- LangChain (LLM applications)
- Hugging Face Transformers (pretrained models)
- FastAPI (model deployment)
- Streamlit / Gradio (quick demos)

**Stage 4 (6+ months)**

- Docker + Kubernetes (containerized deployment)
- Weights & Biases (team collaboration)
- Cloud platforms (AWS / Azure / GCP)
## 📚 Learning Resources

### Official Documentation

- PyTorch: https://pytorch.org/docs/
- TensorFlow: https://www.tensorflow.org/
- LangChain: https://python.langchain.com/
- Hugging Face: https://huggingface.co/docs

### Online Courses

- fast.ai (hands-on PyTorch)
- DeepLearning.AI (TensorFlow specialization)
- Coursera (various specializations)

### Community Resources

- GitHub (open-source projects)
- Stack Overflow (Q&A)
- Reddit (r/MachineLearning)
- Discord/Slack (AI communities)
Last updated: December 22, 2024