Venkatesh Prasanna is a Technology Architect and Ayush Kataria is a Specialist Programmer focused on Search at Infosys Wingspan. They recently completed the Search Fundamentals course on CoRise and brought many ideas back to the search system at their company.
Infosys Wingspan is a multi-tenant, cloud-first, mobile-first learning experience platform being used by multiple customers in education, reskilling, knowledge management, and organizational change management. This essentially means that the platform is home to a variety of learning artifacts, and one of the primary requirements is to help users effectively discover the right learning artifacts for them.
When multiple learning resources are available on the same topic, it is difficult for users to decide which one is best for their learning journey. They might wonder: Which resource is at the right level of complexity? Which resources have the right depth for my needs?
We have had a robust search system in Wingspan that uses multiple signals to decide which results should be ranked highly when users search for courses. We introduced a primitive "Learning to Rank" approach in early 2022, and started to look for guidance on how to evaluate the search system's overall effectiveness. Primarily, we wanted to know if our approach to "Learning to Rank" -- machine learning based reranking of search results -- was headed in the right direction, or if we could improve further. We also had no validation of the way we were capturing our search-related telemetry. We asked ourselves, “How do we establish whether the telemetry data is enough, and whether we are using the available information effectively when feeding it back into the system?”
At this juncture, we came across the Search with Machine Learning course on CoRise, taught by two of the pioneers in the field of Information Retrieval. Instructor Grant Ingersoll is a former CTO of the Wikimedia Foundation and a long-standing committer on the Apache Lucene and Solr open-source projects. Instructor Daniel Tunkelang spearheaded the "guided navigation" idea at Endeca and later at LinkedIn Search, which has paved the way for today's faceted search and aggregation approaches across the search industry. The course was precisely what we needed: it was built for search professionals (CoRise has since introduced a course called Search Fundamentals for a more beginner audience), and it used a technology stack close to ours as well, with a setup on Elasticsearch and hands-on projects using OpenSearch and Python.
As we embarked on this four-week course, the first thing that really struck us was how effective the live course was and how approachable the two experts were. In addition, the learning community was very knowledgeable in the field, and that led to wonderful conversations during the live sessions, at coding parties, and on the course Slack channels, making for a lively learning experience. The fact that the learning community was a small unit of around 150 people meant that there was a lot of scope for detailed interactions during the course.
Throughout this course, we got a chance to validate our indexing approach and design against those of the experts teaching the course, picked up insights and tips on potentially useful features of Elasticsearch that we had not explored earlier, and found performance and scalability improvement ideas for Wingspan. We also got an introduction to the exciting new prospect of “vector search”, thanks to the brilliant community talk by Dimitri Kan during the course. Dimitri himself was one of our co-learners taking the course, which goes to show how the community was bringing new ideas to the table throughout!
We also got a lot of ideas to go back and enhance our search system. We started off by updating our telemetry approach to better capture the signals of user interaction with search results. We then revisited the features that could help us create a better LTR model for our context, so that we could evaluate the possibility of query understanding and how it could be brought into our models. We had many “aha!” moments from taking a step back from our day-to-day business and considering new approaches from our instructors and fellow learners. We shipped many of these updates, including new ideas for search and autosuggestion, to production a couple of months ago!
Lastly, we greatly appreciated how well the course was designed, with its focus on hands-on projects. This course is not just a theoretical introduction to the subject, but a proper implementation of a scalable solution on a real-world dataset! Our instructors set up the project environment so that learners could focus specifically on what they were expected to learn in the course, instead of spending a lot of time setting up the project or writing boilerplate code. In addition, there were weekly peer reviews of our project submissions, so everyone got active one-to-one feedback on their approach from fellow learners. We learned a lot from reviewing other projects as well, as we could see how the same problem could be approached and solved differently by different people, which helped us refine our own work. Even Grant and Daniel were one Slack ping away from detailed help on our code if we were stuck! We would certainly recommend this course and the Search Fundamentals course to anyone working in the search and information retrieval domain, as well as anyone looking to enter the domain.