
Conference Schedule

Day 1

Time Speaker Title Abstract
9:00 AM - 9:30 AM REGISTRATION
9:30 AM - 11:00 AM Prof. Preslav Nakov Towards Truly Open, Language-Specific, Safe, Factual, and Specialized Large Language Models
Abstract: First, we will argue for the need for fully transparent open-source large language models (LLMs), and we will describe the efforts of MBZUAI's Institute on Foundation Models (IFM) towards that based on the LLM360 initiative. Second, we will argue for the need for language-specific LLMs, and we will share our experience from building Jais, the world's leading open Arabic-centric foundation and instruction-tuned large language model, Nanda, our open-weights Hindi LLM, Sherkala, our open-weights Kazakh LLM, and some other models. Third, we will argue for the need for safe LLMs, and we will present Do-Not-Answer, a dataset for evaluating the guardrails of LLMs, which is at the core of the safety mechanisms of our LLMs. Fourth, we will argue for the need for factual LLMs, and we will discuss the factuality challenges that LLMs pose. We will then present some recent relevant tools for addressing these challenges developed at MBZUAI: (i) OpenFactCheck, a framework for fact-checking LLM output, for building customized fact-checking systems, and for benchmarking LLMs for factuality; (ii) LM-Polygraph, a tool for predicting an LLM's uncertainty in its output using cheap and fast uncertainty quantification techniques; and (iii) LLM-DetectAIve, a tool for machine-generated text detection. Finally, we will argue for the need for specialized models, and we will present the zoo of LLMs currently being developed at MBZUAI's IFM.
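As a concrete illustration of the kind of cheap, fast uncertainty signal the abstract alludes to, the sketch below scores a piece of text by its mean token negative log-likelihood under a causal LM. This is a generic example, not LM-Polygraph's actual API, and the model name is only a placeholder.

```python
# Generic sketch of a cheap sequence-level uncertainty score: the mean negative
# log-likelihood per token under a causal LM. Not LM-Polygraph's API; "gpt2" is an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any small causal LM works for this illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def mean_nll(text: str) -> float:
    """Average negative log-likelihood per token; higher means the model is less confident."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # .loss is the mean cross-entropy over tokens
    return out.loss.item()

print(mean_nll("Paris is the capital of France."))
print(mean_nll("Paris is the capital of Australia."))
```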
11:00 AM - 11:15 AM TEA BREAK
11:15 AM - 12:45 PM Prof. Tanmoy Chakraborty Don't underestimate the power of small language models
Abstract: Despite the superior performance demonstrated by Transformer-based LLMs across numerous applications involving natural languages, their high computational cost, energy consumption, and limited accessibility underscore the need for efficient, interpretable, and adaptable small language models (SLMs). This talk highlights methods to develop economical and interpretable SLMs that rival their larger counterparts in performance without significant computational requirements. Our research emphasizes three key dimensions: economical resource usage, adaptability to diverse and low-resource tasks, and enhanced interpretability. Techniques like competitive knowledge distillation, which leverages student-teacher dynamics, and activation sparsity in manifold-preserving transformers demonstrate significant efficiency gains without compromising performance. We formulate novel decomposer components for LLMs that modularize problem decomposition and solution generation, allowing smaller models to excel in complex reasoning tasks. We also propose innovative prompt construction and alignment strategies that boost in-context knowledge adaptation in low-resource settings for SLMs. Our findings demonstrate that SLMs can achieve scalability, interpretability, and adaptability, paving the way for broader and sustainable AI accessibility.
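For readers unfamiliar with student-teacher distillation, the sketch below shows the standard distillation loss that approaches such as competitive knowledge distillation build on: a temperature-softened KL term from the teacher blended with the usual hard-label cross-entropy. The temperature and mixing weight are illustrative, not values from the talk.

```python
# Minimal sketch of the standard student-teacher distillation objective.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-label KL (teacher -> student) with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),   # student log-probs at temperature T
        F.softmax(teacher_logits / T, dim=-1),        # teacher probs at temperature T
        reduction="batchmean",
    ) * (T * T)                                       # standard T^2 gradient rescaling
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: batch of 4 examples, 10 classes
s = torch.randn(4, 10)
t = torch.randn(4, 10)
y = torch.randint(0, 10, (4,))
print(distillation_loss(s, t, y).item())
```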
12:45 PM - 1:45 PM LUNCH
1:45 PM - 3:45 PM Sahil Mishra Retrieval-Augmented Language Models – Bridging LLMs with Efficient Knowledge Retrieval
Abstract: Large Language Models (LLMs) are powerful but have limitations such as lacking recent information and hallucination. Retrieval-augmented language models solve these problems by allowing models to fetch relevant information from external sources instead of relying only on what they were trained on. This session will cover how retrieval-based models work, the different ways they retrieve information (such as sparse and dense retrieval methods), and how they improve accuracy and efficiency. We will explore models like kNN-LMs, REALM, RETRO, and RAG, showing how they use retrieval to enhance responses. Additionally, we will discuss strategies for improving retrieval, aligning retrieved knowledge with model outputs, and refining prompts for better results, especially in low-resource settings. By combining retrieval with language models, we can build smaller, more efficient, and more reliable AI systems that provide accurate, well-supported answers in real-world applications.
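The retrieve-then-generate pattern described above can be sketched in a few lines. The example below uses a sparse (TF-IDF) retriever over a toy document collection and prepends the best-scoring passage to the prompt; the documents, query, and prompt template are purely illustrative.

```python
# Minimal retrieve-then-generate sketch with a sparse (TF-IDF) retriever.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "RETRO retrieves from a trillion-token datastore at the chunk level.",
    "kNN-LM interpolates the LM distribution with a nearest-neighbour distribution.",
    "REALM learns the retriever jointly with masked language modelling.",
]
query = "How does kNN-LM use retrieval?"

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)          # sparse document vectors
query_vec = vectorizer.transform([query])
scores = cosine_similarity(query_vec, doc_vecs)[0]
top_doc = docs[scores.argmax()]                    # keep the best-scoring passage

# The retrieved passage is prepended to the prompt before calling a generator LLM.
prompt = f"Context: {top_doc}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```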
3:45 PM - 4:00 PM TEA BREAK
4:00 PM - 6:00 PM Ankush Chander LLM finetuning: Fundamentals and best practices
Abstract: Large language models have transformed the field of NLP by performing well on tasks that were previously out of reach. Even though LLMs have strong general language capabilities, these are sometimes not enough for application-specific tasks. Fine-tuning allows users to adapt pre-trained LLMs to more specialized tasks. By fine-tuning a model on a small dataset of task-specific data, you can improve its performance on that task while preserving its general language knowledge. In this session, we will cover fine-tuning basics and memory-optimization techniques such as quantization and LoRA, and fine-tune some LLMs along the way.
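As a small taste of the hands-on portion, the sketch below wraps a base causal LM with LoRA adapters using the peft library, so that only a small fraction of the parameters are trained. The base model and target modules are examples, not the exact ones used in the session.

```python
# Sketch of parameter-efficient fine-tuning with LoRA; model and modules are examples.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
# From here, training proceeds as usual (e.g., with transformers.Trainer) on task-specific data.
```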

Day 2

Time Speaker Title Abstract
9:00 AM - 10:30 AM Harsha Kokel A brief tutorial on LLMs for AI Planning
Abstract: Recent advancements in Large Language Models (LLMs) have spurred approaches for planning in natural language. These approaches vary widely, from giving a planning problem to an LLM and asking it to output an entire plan, to asking an LLM to plan step by step, including backtracking. In this talk, I will give a brief overview of AI Planning and how the advancements in LLMs are being used for planning in natural language. I will cover a few different approaches, discuss important properties to consider, and present some benchmarks for evaluation.
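One of the approaches the talk contrasts, generating a plan and then revising it with validator feedback, can be sketched as a simple loop. The llm and validate callables below are placeholders (for example, an LLM API wrapper and a PDDL plan validator), not components of any specific system covered in the talk.

```python
# Hedged sketch of a generate-validate-retry planning loop; llm and validate are placeholders.
from typing import Callable, Tuple

def plan_with_feedback(
    problem: str,
    llm: Callable[[str], str],                    # e.g., a wrapper around an LLM API
    validate: Callable[[str], Tuple[bool, str]],  # e.g., a wrapper around a plan validator
    max_attempts: int = 3,
) -> str:
    plan, feedback = "", ""
    for _ in range(max_attempts):
        prompt = (
            "Solve this planning problem step by step and output one action per line.\n"
            f"Problem:\n{problem}\n{feedback}"
        )
        plan = llm(prompt)
        ok, error = validate(plan)
        if ok:
            return plan
        feedback = f"Your previous plan was invalid: {error}\nPlease fix it."
    return plan  # return the last attempt even if it never validated
```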
10:30 AM - 10:45 AM TEA BREAK
10:45 AM - 12:15 PM Sandipan Dandapat Safe and Inclusive AI for a Multilingual and Multicultural World
Abstract: In this talk, I will explore the challenges and innovations in large language models. I will delve into the complexities of scaling language models, addressing issues such as power versus cost and the responsibilities associated with powerful AI. The talk will then describe two research works in detail. The first focuses on SAGE: Safety AI Generic Evaluation, a novel approach to ensuring AI safety and examining the biases and stereotypes in large language models. The second will present the Linguistically Informed Testing of Multilingual Systems (LITMUS) project, which aims to support universalization through linguistically informed training and testing strategies. Finally, I will conclude with an outlook and future directions for work in this field.
12:15 PM - 1:30 PM LUNCH
1:30 PM - 3:00 PM Manish Gupta A Brief tutorial on Retrieval Augmented Generation
Abstract: In this talk, I will introduce the recently popular concept of retrieval-augmented generation. We will start with retrieval augmentation for classification (REALM) and then extend the framework to generation (RAG). Then we will deliberate on how to scale these models to trillion-sized collections (RETRO) and how to combine retrieval augmentation with few-shot learning (ATLAS). Lastly, I will talk about Internet-augmented generation and the application of RAG in AutoSuggest. Towards the end, I will also briefly touch upon multimodal RAG.
3:00 PM - 3:15 PM TEA BREAK
3:15 PM - 5:15 PM Harish Yenala Practical Fine-Tuning of SLMs: Techniques, Applications, and Performance Insights
Abstract: This hands-on session will focus on fine-tuning Small Language Models (SLMs) for specific applications. We'll cover the importance of SLMs, their limitations, and the need for fine-tuning using methods like Parameter-Efficient Fine-Tuning (PEFT) and QLoRA. Through a real-world example on hate speech text classification, participants will gain insights into data preparation for fine-tuning, the process of fine-tuning the Phi-3 model, and evaluating its impact. We will also touch upon performance and latency comparisons between various SLMs, showcasing their practical implications in real-time applications.
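For orientation before the session, the sketch below shows a QLoRA-style setup: loading a base model in 4-bit precision and attaching LoRA adapters for a binary classification fine-tune. The checkpoint, labels, and hyperparameters are illustrative (it uses phi-2 rather than the Phi-3 model from the session) and it assumes the transformers, peft, and bitsandbytes packages plus a GPU.

```python
# Illustrative QLoRA-style setup: 4-bit base model + LoRA adapters for classification.
# Checkpoint, target modules, and hyperparameters are examples, not the session's exact config.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NF4 quantization as used in QLoRA
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/phi-2",                       # example checkpoint; the session uses Phi-3
    num_labels=2,                            # e.g., hate speech vs. not
    quantization_config=bnb_config,
)
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],     # attention projections in phi-2
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()           # adapters + classification head only
```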