Marc Sun
marcsun13
116 followers · 157 following
_marcsun
SunMarc
AI & ML interests
LLM, Quantization, Training, Inference
Recent Activity
liked a model about 24 hours ago: microsoft/bitnet-b1.58-2B-4T
upvoted an article 6 days ago: Memory-efficient Diffusion Transformers with Quanto and Diffusers
reacted to Wauplin's post with 🤗 8 days ago
‼️ huggingface_hub's v0.30.0 is out with our biggest update of the past two years! Full release notes: https://github.com/huggingface/huggingface_hub/releases/tag/v0.30.0

🚀 Ready. Xet. Go!

Xet is a groundbreaking new protocol for storing large objects in Git repositories, designed to replace Git LFS. Unlike LFS, which deduplicates files, Xet operates at the chunk level, making it a game-changer for AI builders collaborating on massive models and datasets. Our Python integration is powered by [xet-core](https://github.com/huggingface/xet-core), a Rust-based package that handles all the low-level details.

You can start using Xet today by installing the optional dependency:

```bash
pip install -U huggingface_hub[hf_xet]
```

With that, you can seamlessly download files from Xet-enabled repositories! And don't worry: everything remains fully backward-compatible if you're not ready to upgrade yet.

Blog post: https://huggingface.co/blog/xet-on-the-hub
Docs: https://huggingface.co/docs/hub/en/storage-backends#xet

⚡ Inference Providers

- We're thrilled to introduce Cerebras and Cohere as official inference providers! This expansion strengthens the Hub as the go-to entry point for running inference on open-weight models.
- Novita is now our third provider to support the text-to-video task, after Fal.ai and Replicate.
- Centralized billing: manage your budget and set team-wide spending limits for Inference Providers! Available to all Enterprise Hub organizations.

```py
from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai", bill_to="my-cool-company")
image = client.text_to_image(
    "A majestic lion in a fantasy forest",
    model="black-forest-labs/FLUX.1-schnell",
)
image.save("lion.png")
```

- No more timeouts when generating videos, thanks to async calls. Available right now for Fal.ai; expecting more providers to leverage the same structure very soon!
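The chunk-level deduplication idea behind Xet can be illustrated with a toy sketch. This is not the actual xet-core algorithm (which uses content-defined chunking in Rust); it just shows, with naive fixed-size chunks, why hashing chunks instead of whole files means a small edit to a large model file only re-uploads the chunks that changed:

```python
import hashlib
import random

CHUNK_SIZE = 64 * 1024  # toy fixed-size chunks; real Xet uses content-defined chunking

def chunk_hashes(data: bytes) -> list[str]:
    # Split data into fixed-size chunks and hash each one.
    return [
        hashlib.sha256(data[i : i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def new_chunks(old: bytes, new: bytes) -> int:
    # Count chunks of `new` whose hash is not already stored for `old`.
    stored = set(chunk_hashes(old))
    return sum(1 for h in chunk_hashes(new) if h not in stored)

random.seed(0)
v1 = random.randbytes(10 * 1024 * 1024)   # a 10 MiB "model file"
v2 = v1[:-1] + bytes([v1[-1] ^ 1])        # same file with one byte flipped

total = len(chunk_hashes(v2))
changed = new_chunks(v1, v2)
print(f"{changed} of {total} chunks need re-uploading")  # 1 of 160
```

A file-level scheme like LFS would treat `v2` as an entirely new 10 MiB object. Note that fixed-size chunking breaks down when bytes are inserted (every later chunk boundary shifts); content-defined chunking, which xet-core implements, derives boundaries from the content itself so deduplication survives insertions too.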
Articles (8)
- 52 · LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone!
- 42 · Introducing SynthID Text
Models (19), sorted by recently updated
- marcsun13/Llama-3.1-8B-Instruct-bnb-4bit · Text Generation · Updated 27 days ago · 4
- marcsun13/phi-4-bnb-4bit · Text Generation · Updated 29 days ago · 1
- marcsun13/Llama-3.2-1B-bnb-4bit · Text Generation · Updated 29 days ago · 3
- marcsun13/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8 · Updated Feb 13 · 5
- marcsun13/paligemma_vqav2 · Updated Feb 4
- marcsun13/Meta-Llama-3-8B-torchao-int8_weight_only · Updated Oct 18, 2024
- marcsun13/sft_openassistant-guanaco · Text Generation · Updated Jul 5, 2024 · 2
- marcsun13/gemma-2-27b-it-bnb-colab · Text Generation · Updated Jul 4, 2024 · 1
- marcsun13/gemma-2-9b-it-GPTQ · Text Generation · Updated Jul 3, 2024 · 344 · 3
- marcsun13/test_push_checkpoint · Fill-Mask · Updated Jun 28, 2024 · 11
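Several of the checkpoints above are quantized variants (bnb-4bit, GPTQ, SpinQuant INT4, int8 weight-only). As a minimal sketch of what "int8 weight-only" quantization means, assuming a standard per-channel absmax scheme (not necessarily torchao's exact recipe):

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    # Per-output-channel absmax quantization: scale each row into [-127, 127].
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # Recover an approximate float32 weight for use at inference time.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)  # toy weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# int8 storage is 4x smaller than float32 (plus one scale per row),
# and the reconstruction error is bounded by half a quantization step.
max_err = float(np.abs(w - w_hat).max())
```

"Weight-only" means only the stored weights are int8; activations stay in floating point, so each matmul dequantizes (or fuses the scale into) the weights on the fly.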
Datasets: none public yet