Introduction to Information Retrieval Course | IISER Kolkata | Prof. Dwaipayan Roy
Course Details
| Field | Value |
|---|---|
| Exam Registration | 2679 |
| Course Status | Ongoing |
| Course Type | Elective |
| Language | English |
| Duration | 12 weeks |
| Categories | Computer Science and Engineering |
| Credit Points | 3 |
| Level | Undergraduate/Postgraduate |
| Start Date | 19 Jan 2026 |
| End Date | 10 Apr 2026 |
| Enrollment Ends | 02 Feb 2026 |
| Exam Registration Ends | 20 Feb 2026 |
| Exam Date | 25 Apr 2026 IST |
| NCrF Level | 4.5 — 8.0 |
Introduction to Information Retrieval: Your Gateway to the Science of Search
In today's data-driven world, the ability to find relevant information efficiently is paramount. The Introduction to Information Retrieval (IR) course, offered by the Indian Institute of Science Education and Research (IISER), Kolkata, and led by Prof. Dwaipayan Roy, provides a deep dive into the foundational science behind search engines and modern document retrieval systems. This 12-week journey is meticulously designed for undergraduate and postgraduate students to master the core principles and cutting-edge applications of IR.
Meet Your Instructor: Prof. Dwaipayan Roy
Prof. Dwaipayan Roy brings a wealth of expertise to this course. He earned his PhD in Computer Science from the prestigious Indian Statistical Institute, Kolkata, and further honed his research skills as a Post-doctoral Researcher at GESIS – Leibniz Institute for the Social Sciences in Cologne, Germany. Currently serving as an Assistant Professor at IISER Kolkata, Prof. Roy's academic and research background ensures that the course content is both rigorous and aligned with contemporary industry and research trends.
Course Overview and Prerequisites
This course offers a comprehensive introduction to Information Retrieval, balancing theoretical concepts with practical, hands-on implementation. It is structured to prepare students both for advanced academic research and for industry roles in search and related technologies.
Prerequisites:
- Basic knowledge of programming languages like Java or Python.
- Fundamental understanding of data structures and algorithms.
Industry Support & Recognition:
Information Retrieval is a critical skill across the tech landscape. This course is highly valued by leading companies, including:
- Google: For core work in web search, document retrieval, and question-answering systems.
- Microsoft: In products like Bing, Azure Cognitive Search, and Office 365 search features.
- Amazon: For enhancing product search and AI services like Amazon Kendra on AWS.
Detailed 12-Week Course Layout
Weeks 1-4: Foundations & Indexing
The course begins by establishing the core building blocks of any IR system.
- Week 1: Introduction, text processing, tokenization, Boolean retrieval, and the fundamental Inverted Index structure.
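- Week 2: Deep dive into index construction with blocked sort-based indexing (BSBI) and single-pass in-memory indexing (SPIMI), index compression techniques, and the statistical laws that describe text collections: Zipf's law and Heaps' law.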
- Weeks 3 & 4: Hands-on introduction to PyLucene. Students learn to build indexes, analyze non-English text, and use tools like Luke for index viewing.
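As a taste of the Week 1 material, the inverted index and Boolean retrieval can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the whitespace tokenizer, the function names, and the four toy documents are all assumptions made for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, term_a, term_b):
    """Intersect two sorted postings lists with a two-pointer merge."""
    a = index.get(term_a, [])
    b = index.get(term_b, [])
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


# Toy collection (hypothetical data for illustration only).
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
    4: "july new home sales rise",
}
index = build_inverted_index(docs)
print(boolean_and(index, "new", "july"))  # IDs of documents containing both terms
```

Keeping the postings lists sorted is what makes the linear-time merge possible; a production index would also compress them, which is the subject of Week 2.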
Weeks 5-7: Retrieval Models
This module focuses on moving beyond simple Boolean matching to ranked retrieval, which powers modern search.
- Week 5: Introduction to the Vector Space Model (VSM), TF-IDF weighting, and Cosine Similarity.
- Week 6: Probabilistic models, including the foundational BM25 family of algorithms and their relevance today.
- Week 7: Language Modeling approaches for IR, covering estimation, the zero-frequency problem, and smoothing techniques (Jelinek-Mercer, Dirichlet).
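To make the Week 6 material concrete, here is a minimal sketch of BM25 scoring. It assumes whitespace tokenization, the commonly quoted defaults `k1=1.2` and `b=0.75`, and a non-negative IDF variant with 1 added inside the logarithm; other BM25 variants differ in these details. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in a non-empty list against the query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample collection.
docs = [
    "the cat sat on the mat",
    "dogs chase cats",
    "the cat chased the cat",
]
scores = bm25_scores("cat", docs)
print(scores)
```

The third document mentions the query term twice and is shorter than the first, so it receives the higher score; the second document never contains the exact token `cat` and scores zero.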
Weeks 8-10: Evaluation & Advanced Topics
Learning to build a system is incomplete without knowing how to measure its effectiveness.
- Week 8: Advanced PyLucene queries and the use of divergence measures in retrieval: Kullback-Leibler divergence (KLD) and Jensen-Shannon divergence (JSD).
- Week 9: Comprehensive coverage of evaluation metrics: Precision, Recall, F-measure, MAP, nDCG, and hypothesis testing.
- Week 10: Practical work with benchmark datasets (e.g., TREC), scoring runs with the `trec_eval` tool, and an introduction to Relevance Feedback methods like Rocchio.
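Several of the Week 9 metrics are short enough to compute by hand. The sketch below is illustrative only: the function names and the sample ranking are assumptions, and the nDCG shown uses one common formulation (linear gain, log2 rank discount), whereas other formulations use an exponential gain.

```python
import math


def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall over the top-k results (relevant must be non-empty)."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k, hits / len(relevant)


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def ndcg_at_k(ranked, gains, k):
    """nDCG@k with linear gain and a log2(rank + 1) discount."""
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc, 0) for doc in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical ranking and relevance judgments.
ranked = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2", "d9"}
precision, recall = precision_recall_at_k(ranked, relevant, 5)
ap = average_precision(ranked, relevant)
print(precision, recall, ap)
```

MAP is simply this average precision averaged over a set of queries. For real experiments the `trec_eval` tool is the usual choice, since it keeps results comparable across papers.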
Weeks 11-12: Modern IR Applications
The course concludes by exploring specialized and contemporary applications of IR.
- Week 11: Web Search fundamentals: crawlers, shingling, link analysis algorithms (PageRank, HITS), and an introduction to SEO.
- Week 12: Cutting-edge topics including Learning to Rank, Latent Semantic Indexing, and an introduction to embeddings (Word, Sentence, Document) and the application of BERT and Large Language Models (LLMs) in IR.
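Of the Week 11 topics, PageRank is compact enough to sketch here using power iteration. This is a simplified illustration under stated assumptions: the damping factor of 0.85 is the commonly cited value, pages without outgoing links spread their rank uniformly over all pages, and the function name and four-page link graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a graph given as {page: [pages it links to]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        # Rank held by pages with no out-links is shared equally by all pages.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by every other page.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))
```

Each iteration preserves the total rank of 1, so the result can be read as a probability distribution over pages visited by a random surfer.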
Recommended Textbooks
To supplement the course material, students are encouraged to refer to these authoritative texts:
| Book Title | Authors | Link |
|---|---|---|
| Introduction to Information Retrieval | C. Manning, P. Raghavan, H. Schütze | https://nlp.stanford.edu/IR-book/ |
| Information Retrieval: Implementing and Evaluating Search Engines | S. Büttcher, C. L. A. Clarke, G. Cormack | - |
Who Should Take This Course?
This course is ideally suited for:
- Computer Science and Engineering students (UG/PG) looking to specialize in search, NLP, or data science.
- Aspiring researchers interested in the academic foundations of information access.
- Developers and engineers aiming to build careers at tech giants focused on search, recommendation, or AI-driven analytics.
By the end of this 12-week program, learners will have gained a solid theoretical understanding and practical experience in building and evaluating information retrieval systems, positioning them at the forefront of a critical and ever-evolving field.