Comparing changes

base repository: huggingface/text-generation-inference
base: v3.3.2
...
head repository: huggingface/text-generation-inference
compare: v3.3.3
  • 14 commits
  • 97 files changed
  • 5 contributors

Commits on Jun 3, 2025

  1. Remove useless packages (#3253)

    Signed-off-by: yuanwu <yuan.wu@intel.com>
    yuanwu2017 authored Jun 3, 2025
    1ff9d18

Commits on Jun 10, 2025

  1. Bump neuron SDK version (#3260)

    * chore(neuron): bump version to 0.2.0
    
    * refactor(neuron): use named parameters in inputs helpers
    
    This allows to hide the differences between the two backends in terms of
    input parameters.
    
    * refactor(neuron): remove obsolete code paths
    
    * fix(neuron): use neuron_config whenever possible
    
    * fix(neuron): use new cache import path
    
    * fix(neuron): neuron config is not stored in config anymore
    
    * fix(nxd): adapt model retrieval to new APIs
    
    * fix(generator): emulate greedy in sampling parameters
    
    When on-device sampling is enabled, we need to emulate the greedy
    behaviour using top-k=1, top-p=1, temperature=1.
    
    * test(neuron): update models and expectations
    
    * feat(neuron): support on-device sampling
    
    * fix(neuron): adapt entrypoint
    
    * tests(neuron): remove obsolete models
    
    * fix(neuron): adjust test expectations for llama on nxd
    dacorvo authored Jun 10, 2025
    79183d1
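
    The greedy emulation described in this commit message (top-k=1, top-p=1, temperature=1) can be illustrated with a minimal sketch. The helper names `emulate_greedy` and `sample`, and the dictionary keys, are hypothetical and chosen for illustration only; they are not taken from the TGI or Neuron code. The toy sampler stands in for an on-device sampler to show why these settings always select the highest-scoring token.

```python
import math
import random


def emulate_greedy(params: dict) -> dict:
    """Return sampling parameters that reproduce greedy decoding.

    Hypothetical helper: the keys used here are illustrative, not the TGI API.
    """
    if params.get("do_sample", False):
        return dict(params)  # genuine sampling request: leave it untouched
    # Greedy request: force the sampler to keep only the best token.
    return {**params, "top_k": 1, "top_p": 1.0, "temperature": 1.0}


def sample(logits, top_k, top_p, temperature, rng):
    """Toy top-k / top-p sampler over a list of logits (assumed stand-in)."""
    scaled = [x / temperature for x in logits]
    # Keep the ids of the top_k highest-scoring tokens, best first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax restricted to the kept tokens.
    peak = max(scaled[i] for i in order)
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Nucleus filter: keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for idx, p in zip(order, probs):
        kept.append((idx, p))
        mass += p
        if mass >= top_p:
            break
    ids, ps = zip(*kept)
    return rng.choices(ids, weights=ps, k=1)[0]


if __name__ == "__main__":
    params = emulate_greedy({"do_sample": False, "top_k": 50, "temperature": 0.7})
    logits = [0.1, 2.5, -1.0, 2.4]
    rng = random.Random(0)
    token = sample(logits, params["top_k"], params["top_p"], params["temperature"], rng)
    print(params, token)
```

    With top-k=1 only one candidate survives the filter, so the draw is deterministic regardless of the random state; top-p=1 and temperature=1 simply leave that single candidate unmodified.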

Commits on Jun 11, 2025

  1. [gaudi] Perf optimization (#3256)

    Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
    sywangyi authored Jun 11, 2025
    8394776

Commits on Jun 12, 2025

  1. [gaudi] Vlm rebase and issue fix in benchmark test (#3263)

    Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
    sywangyi authored Jun 12, 2025
    613b8dd
  2. [gaudi] Move the _update_cos_sin_cache into get_cos_sin (#3254)

    Signed-off-by: yuanwu <yuan.wu@intel.com>
    yuanwu2017 authored Jun 12, 2025
    25fdc5f
  3. [Gaudi] Remove optimum-habana (#3261)

    Signed-off-by: yuanwu <yuan.wu@intel.com>
    yuanwu2017 authored Jun 12, 2025
    e07056a

Commits on Jun 13, 2025

  1. [gaudi] HuggingFaceM4/idefics2-8b issue fix (#3264)

    Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
    sywangyi authored Jun 13, 2025
    a220e57
  2. [Gaudi] Enable Qwen3_moe model (#3244)

    Signed-off-by: yuanwu <yuan.wu@intel.com>
    yuanwu2017 authored Jun 13, 2025
    ded4cb5
  3. [Gaudi] Fix the integration-test issues (#3265)

    Signed-off-by: yuanwu <yuan.wu@intel.com>
    yuanwu2017 authored Jun 13, 2025
    3752143

Commits on Jun 17, 2025

  1. [Gaudi] use pad_token_id to pad input id (#3268)

    Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
    sywangyi authored Jun 17, 2025
    0627983

Commits on Jun 18, 2025

  1. b4d17f1
  2. [gaudi] Refine logging for Gaudi warmup (#3222)

    * Refine logging for Gaudi warmup
    
    * Make style
    
    * Make style 2
    
    * Flash causal LM case
    
    * Add log_master & VLM cases
    
    * Black
    regisss authored Jun 18, 2025
    f13e28c
  3. doc: fix README (#3271)

    dacorvo authored Jun 18, 2025
    bd1bdeb
  4. chore: release 3.2.3

    dacorvo committed Jun 18, 2025
    1754b79