Unit 3
By Dr Ashaq Hussain Bhat
The Hugging Face Transformers library supports models for various tasks, including text classification, named entity recognition, question answering, and
text generation. It also provides tools for tokenization, data preprocessing, and model evaluation. With it, developers can quickly prototype and deploy
NLP solutions using cutting-edge models like BERT, GPT, and T5.
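As a minimal sketch of this kind of rapid prototyping, the pipeline API can load a pre-trained model and run inference in a few lines. The checkpoint name below is an illustrative question-answering model from the Hub, not a prescribed choice.

```python
# Minimal prototyping sketch with the pipeline API.
# The checkpoint is an illustrative QA model from the Hugging Face Hub.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
result = qa(
    question="Which frameworks does the library integrate with?",
    context="Hugging Face Transformers integrates with both PyTorch and TensorFlow.",
)
print(result["answer"], result["score"])
```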
❑ Support for multiple architectures: Includes models like BERT, GPT, RoBERTa, T5, and many others.
❑ Pre-trained models: Offers a vast library of pre-trained models ready for fine-tuning or use out-of-the-
box.
❑ Easy-to-use API: Provides a straightforward API for loading, training, and evaluating models (see the sketch after this list).
❑ Integration with PyTorch and TensorFlow: Supports both major deep learning frameworks.
❑ Model hub: Centralized repository for sharing and discovering pre-trained models.
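A minimal sketch of the loading and inference workflow referenced in the list above, assuming the PyTorch backend and an illustrative bert-base-uncased checkpoint; the equivalent TF-prefixed Auto classes cover TensorFlow.

```python
# Load a tokenizer and model, tokenize a sentence, and run a forward pass.
# "bert-base-uncased" is an illustrative checkpoint; the randomly initialised
# classification head would still need fine-tuning before real use.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Transformers offers a straightforward API.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # raw, untrained class scores of shape (1, 2)
print(logits)
```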
• Hugging Face provides an extensive Model Hub where users can explore thousands of pre-trained models for a
wide range of NLP tasks, such as text classification, summarization, and question answering.
• Users can filter models by task, language, and architecture, and access pre-trained models with a simple API
call.
• Interactive demos and documentation help users experiment with models without needing to write code
immediately.
• Gemma is not a feature of Hugging Face itself; it is a family of open-weight language models released by Google DeepMind and distributed through the Hugging Face Hub, where it can be loaded and fine-tuned with the same Transformers API as other models.
Gemma, on the other hand, is a more recent addition to the Hugging Face model zoo. It represents another step forward in
language model capabilities, offering improved performance on certain tasks and potentially new features or architectural
innovations. Both models can be easily accessed and fine-tuned using the Hugging Face Transformers library, enabling
researchers and developers to leverage their power for various applications.
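As a hedged sketch of how such a model can be pulled from the Hub, the snippet below loads a Gemma checkpoint for text generation. The model id google/gemma-2b is illustrative; Gemma weights are gated, so accepting the license on the Hub and authenticating with a token may be required.

```python
# Hedged sketch: load a Gemma checkpoint from the Hub and generate a continuation.
# "google/gemma-2b" is an illustrative, gated model id; Hub authentication may be needed.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The Hugging Face Model Hub makes it easy to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```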
• Generalization issues: Models trained on specific datasets might fail to generalize to new tasks or languages.
• Inability to handle long-term dependencies: Basic models often struggle with understanding context across long spans of text.
• Wav2Vec 2.0.
• HuBERT.
• MarianMT.
• T5.
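The list above names speech models (Wav2Vec 2.0, HuBERT) alongside translation-capable ones (MarianMT, T5), all accessible through the Transformers API. Below is a hedged sketch of speech recognition with a Wav2Vec 2.0 checkpoint, using an illustrative model id and a placeholder audio path.

```python
# Hedged sketch of automatic speech recognition with Wav2Vec 2.0.
# "facebook/wav2vec2-base-960h" is an illustrative checkpoint; "speech.wav" is a
# placeholder path to a 16 kHz mono audio file supplied by the user.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
print(asr("speech.wav")["text"])
```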
Key difference between generative and discriminative language models:
• Generative models aim to model the joint probability distribution P(x, y), enabling them to generate new data instances by predicting both the input features x and the labels y. In NLP, generative models can produce new sequences, as in text generation tasks.
• Discriminative models, by contrast, model the conditional probability P(y | x), aiming to classify data by learning the decision boundary between classes. These models are typically used for tasks like classification, where the goal is to distinguish between predefined categories (a short code sketch contrasting the two follows below).
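A minimal sketch contrasting the two families in code, under the assumption that the generative side is represented by a causal language model and the discriminative side by a text classifier; both checkpoint names are illustrative.

```python
# Contrast a generative and a discriminative model via Transformers pipelines.
from transformers import pipeline

# Generative: models the sequence distribution and produces new text.
generator = pipeline("text-generation", model="gpt2")
print(generator("Language models can", max_new_tokens=20)[0]["generated_text"])

# Discriminative: learns a decision boundary and assigns one of the predefined labels.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Language models can be remarkably useful."))
```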
Role of query expansion in improving information retrieval:
• Query expansion helps improve retrieval effectiveness, particularly recall, by adding semantically related terms to a query.
• Language models like BERT or GPT-3 can be used for query expansion by predicting relevant terms or
reformulating the query to include synonyms or related phrases.
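One possible (assumed, not prescribed) way to realise this with a masked language model is to embed the query in a template containing a [MASK] token and treat the top predictions as candidate expansion terms; the checkpoint and template below are illustrative.

```python
# Hedged sketch of query expansion with a masked language model: the query is
# embedded in a template with a [MASK] token, and the top predictions become
# candidate expansion terms. Checkpoint and template are illustrative choices.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

query = "cheap flights"
template = f"I am searching for {query} and [MASK]."
candidates = [pred["token_str"].strip() for pred in fill_mask(template, top_k=5)]

expanded_query = f"{query} " + " ".join(candidates)
print(expanded_query)
```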
Fairness in AI aims to ensure that models do not discriminate against certain groups or individuals based on protected attributes
such as race, gender, or age. Researchers and developers must work to identify and mitigate biases in language models, employing
techniques such as debiasing algorithms, diverse and representative training data, and regular audits of model outputs.
• Bias can occur when a model reflects social, gender, or racial biases present in the training data.
• For example, a language model may generate biased text or exhibit stereotypical associations in tasks like
sentiment analysis.
• Pre-trained models in areas like healthcare or criminal justice pose risks of biased predictions.
• Mitigation includes thorough testing, bias evaluation, and ensuring transparency by documenting model behavior and limitations (for example, in model cards).
While large language models (LLMs) have made significant advancements, their deployment raises ethical concerns:
• Bias and Fairness: LLMs may perpetuate societal biases present in the training data, leading to discrimination in areas like hiring or law
enforcement. For example, biased outputs in recruitment AI tools could unfairly disadvantage minority groups.
o Mitigation: Techniques like bias mitigation, algorithmic fairness checks, and diverse training datasets can help reduce biases.
• Misinformation: LLMs can generate misleading or incorrect information that might be taken as fact, exacerbating the spread of
misinformation.
o Mitigation: Implementing fact-checking layers and transparency about model limitations could help mitigate this risk.
• Privacy: LLMs can unintentionally memorize sensitive data (e.g., personal information), leading to privacy violations.
o Mitigation: Using techniques like differential privacy and responsible data governance can help protect user privacy.
• Environmental Impact: Training large models consumes vast computational resources, contributing to the carbon footprint.
o Mitigation: Green AI initiatives and more efficient model designs (e.g., distillation or pruning) can reduce energy consumption.
• Computational Resources: Pre-training requires significant hardware, time, and energy, making it inaccessible to many researchers and organizations.
• Data Quality: Pre-training on large datasets can lead to the model learning from noise or biased content in the data.
• Catastrophic Forgetting: During fine-tuning, the model may forget important general knowledge acquired during pre-training.
• Overfitting: Fine-tuning on small datasets can cause the model to overfit, reducing
generalizability.
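A common (assumed here, not mandated) way to soften both catastrophic forgetting and overfitting on small datasets is to freeze most of the pre-trained weights and update only the task head; a minimal sketch with an illustrative BERT checkpoint:

```python
# Freeze the pre-trained encoder so fine-tuning only updates the new task head,
# reducing catastrophic forgetting and overfitting risk on small datasets.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze all encoder parameters; only the classification head remains trainable.
for param in model.bert.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,}")
```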
Machine Translation
Machine translation involves automatically translating text from one language to another. The field has evolved through several key
stages:
1. Rule-Based Translation (RBT): Early systems relied on grammatical rules and bilingual dictionaries.
o Advantages: Rule transparency and grammatical accuracy.
o Disadvantages: Inflexibility, requiring extensive manual rule crafting.
o Example: SYSTRAN, used by the European Union in early translations.
2. Statistical Machine Translation (SMT): Uses probabilistic models based on bilingual corpora to predict translations.
o Advantages: Adaptability, better handling of real-world variability.
o Disadvantages: Requires large amounts of bilingual data, struggles with context and fluency.
o Example: Google Translate before the neural era.
3. Neural Machine Translation (NMT): Leverages deep learning models (often based on Transformers) to generate translations.
o Advantages: Context-aware, fluent, and able to generalize better across languages.
o Disadvantages: Requires vast data and computational resources, and can produce overly smooth or generic translations.
o Example: Google Translate and OpenNMT.
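A minimal NMT sketch using the Transformers translation pipeline with a MarianMT checkpoint; the English-to-German model id below is illustrative, and other language pairs follow the same Helsinki-NLP/opus-mt-* naming pattern.

```python
# Neural machine translation with a MarianMT checkpoint from the Hub.
# "Helsinki-NLP/opus-mt-en-de" is an illustrative English-to-German model.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
result = translator("Machine translation has evolved from rule-based systems to neural models.")
print(result[0]["translation_text"])
```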
We can expect to see increased integration of language models with other AI domains, such as computer vision and robotics, leading to more sophisticated
multimodal systems. Ethical considerations will continue to play a crucial role, with ongoing efforts to develop more transparent, fair, and controllable
language models. As these technologies advance, they have the potential to revolutionize human-computer interaction and unlock new possibilities across
various industries and applications.