Explore projects
陈洋 / mTuner
Apache License 2.0
季玮晔 / ultraattn
Apache License 2.0
黄云飞 / vllm
Apache License 2.0
李拯先 / omniserve
Apache License 2.0
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
李拯先 / KernelBench
MIT License
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
llmc is an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.
董西淼 / sd_embed
Apache License 2.0
lijian / AutoAWQ
MIT License
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
李拯先 / vllm
Apache License 2.0
A high-throughput and memory-efficient inference and serving engine for LLMs
唐适之 / fairscale
BSD 3-Clause "New" or "Revised" License
PyTorch extensions for high performance and large scale training.