[Large Video Generation Models] THUDM/CogVideoX-2b
CogVideoX-2b
Model introduction
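CogVideoX is an open-source video generation model that shares its origins with 清影 (Qingying).
Basic information
Release date
August 2024
Demo videos generated in model tests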
https://github.com/THUDM/CogVideo/raw/main/resources/videos/1.mp4
https://github.com/THUDM/CogVideo/raw/main/resources/videos/2.mp4
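Video generation limits
Prompt language: English
Maximum prompt length: 226 tokens
Video length: 6 seconds
Frame rate: 8 frames per second
Video resolution: 720 * 480; other resolutions are not supported (including for fine-tuning)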
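The num_frames=49 value used in the inference code in this post follows from these limits: 6 seconds at 8 frames per second gives 48 frames, and the generated clip carries one extra frame. The sketch below shows that arithmetic together with a simple pre-flight check. The helper names `frame_count` and `check_request` are hypothetical and are not part of diffusers; the token count is assumed to come from whatever tokenizer you use.

```python
# Limits as listed above for CogVideoX-2b.
MAX_PROMPT_TOKENS = 226
VIDEO_SECONDS = 6
FPS = 8
RESOLUTION = (720, 480)  # width, height


def frame_count(seconds=VIDEO_SECONDS, fps=FPS):
    """Frames in a clip: seconds * fps, plus one extra frame."""
    return seconds * fps + 1


def check_request(num_prompt_tokens, width, height):
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if num_prompt_tokens > MAX_PROMPT_TOKENS:
        problems.append(
            f"prompt has {num_prompt_tokens} tokens, limit is {MAX_PROMPT_TOKENS}"
        )
    if (width, height) != RESOLUTION:
        problems.append(
            f"resolution {width}x{height} is not supported, "
            f"use {RESOLUTION[0]}x{RESOLUTION[1]}"
        )
    return problems


print(frame_count())                   # frames for a 6 second clip at 8 fps
print(check_request(300, 1280, 720))   # two problems reported
```

Running a check like this before starting a 50-step inference run can save time, since a rejected prompt or resolution would otherwise only surface after the model has loaded.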
Environment setup
# diffusers>=0.30.1
# transformers>=4.44.0
# accelerate>=0.33.0 (suggest install from source)
# imageio-ffmpeg>=0.5.1
pip install --upgrade transformers accelerate diffusers imageio-ffmpeg
Running the model
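Before running the pipeline, it can help to confirm that the installed packages meet the minimum versions listed in the setup notes. The following is a minimal sketch using only the standard library; `parse` and `check` are hypothetical helper names, the version parser is deliberately simple (it keeps only the leading integers of each component), and the transformers minimum is read as 4.44.0 on the assumption that the listed 0.44.0 is a typo, since transformers releases are numbered 4.x.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the setup notes.
# Assumption: transformers "0.44.0" is read as 4.44.0.
MINIMUMS = {
    "diffusers": "0.30.1",
    "transformers": "4.44.0",
    "accelerate": "0.33.0",
    "imageio-ffmpeg": "0.5.1",
}


def parse(v):
    """Turn '0.30.1' or '0.33.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check(package, minimum):
    """Return 'ok', 'too old' or 'missing' for one package."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "missing"
    return "ok" if parse(installed) >= parse(minimum) else "too old"


for name, minimum in MINIMUMS.items():
    print(f"{name}: {check(name, minimum)} (need >= {minimum})")
```

Any package reported as missing or too old can be fixed with the pip command from the setup notes.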
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."

# Load the pipeline in half precision
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)

# Memory-saving options: offload weights to the CPU and decode the VAE in slices and tiles
pipe.enable_model_cpu_offload()
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

# Generate 49 frames with a fixed seed for reproducibility
video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

# Write the frames to an MP4 file at 8 frames per second
export_to_video(video, "output.mp4", fps=8)
Quantized Inference
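PytorchAO (TorchAO) and Optimum-quanto can be used to quantize the text encoder, Transformer, and VAE modules, which lowers CogVideoX's memory requirements. This makes it possible to run the model on a free T4 Colab instance or on GPUs with less VRAM. Notably, TorchAO quantization is fully compatible with torch.compile, which can significantly speed up inference.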
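To make the memory saving concrete, here is a conceptual sketch of symmetric int8 weight-only quantization on a single row of weights, written in plain Python. It is an illustration under simplifying assumptions (one scale per row, no zero point) and is not how TorchAO or Optimum-quanto are implemented; the helper names `quantize_row` and `dequantize_row` and the sample values are hypothetical.

```python
def quantize_row(weights):
    """Symmetric int8 quantization of one row: returns (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]


row = [0.12, -0.5, 0.33, 0.0, 0.49]  # made-up example weights
q, scale = quantize_row(row)
restored = dequantize_row(q, scale)
max_error = max(abs(a - b) for a, b in zip(row, restored))

print(q)
print(f"scale={scale:.6f}, max error={max_error:.6f}")
```

Each weight is stored in one byte instead of the two bytes used by float16 or bfloat16, so weight memory is roughly halved, at the cost of a rounding error of at most half a quantization step per weight.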
# To get started, PytorchAO needs to be installed from the GitHub source and PyTorch Nightly.
# Source and nightly installation is only required until next release.

import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXTransformer3DModel, CogVideoXPipeline
from diffusers.utils import export_to_video
from transformers import T5EncoderModel
from torchao.quantization import quantize_, int8_weight_only, int8_dynamic_activation_int8_weight

quantization = int8_weight_only

# Load each module separately and quantize it in place
text_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX-2b", subfolder="text_encoder", torch_dtype=torch.bfloat16)
quantize_(text_encoder, quantization())

transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX-2b", subfolder="transformer", torch_dtype=torch.bfloat16)
quantize_(transformer, quantization())

vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.bfloat16)
quantize_(vae, quantization())

# Create pipeline and run inference
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

# The prompt must be in English
prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."

video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "output.mp4", fps=8)
Download
model_id: THUDM/CogVideoX-2b
Download URL: https://hf-mirror.com/THUDM/CogVideoX-2b (mirror site; no VPN required)
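One way to fetch the weights through the mirror from Python is to point huggingface_hub at it with the HF_ENDPOINT environment variable and then call snapshot_download. The sketch below assumes huggingface_hub is installed; the helper names `configure_mirror` and `download_model` and the `./CogVideoX-2b` target directory are hypothetical choices for illustration.

```python
import os

MIRROR = "https://hf-mirror.com"
MODEL_ID = "THUDM/CogVideoX-2b"


def configure_mirror(endpoint=MIRROR):
    """Point Hugging Face downloads at the mirror and return the endpoint."""
    # Set this before huggingface_hub is first imported in the process;
    # the library reads the endpoint when it is imported.
    os.environ["HF_ENDPOINT"] = endpoint
    return endpoint


def download_model(local_dir="./CogVideoX-2b"):
    """Download the full model snapshot and return its local path."""
    configure_mirror()
    # Imported here so the endpoint is already set when the library loads.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=MODEL_ID, local_dir=local_dir)


# Usage (starts a multi-gigabyte download):
# path = download_model()
# print(path)
```

The returned local path can then be passed to from_pretrained in place of the model id, which avoids any further network access at load time.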
License
License: apache-2.0
References
https://hf-mirror.com/THUDM/CogVideoX-2b
https://github.com/THUDM/CogVideo