Search with Machine Learning
We’ve designed this course to cover the fundamentals of integrating machine learning and natural language processing techniques into search engines. We’ll dive into using machine learning for ranking, content understanding, and query understanding, along with how to use embeddings, dense vectors and deep learning to improve retrieval and ranking. You will build applications using OpenSearch (an open fork of Elasticsearch) and several machine learning libraries and plugins.
This course assumes that you have already learned search fundamentals; if not, then we encourage you to take our companion “Search Fundamentals” course so that you will be prepared for this one.
Former CTO at Wikimedia
Grant is a CTO, independent consultant and advisor. He is the former CTO of the Wikimedia Foundation and the co-founder and ex-CTO of Lucidworks, co-author of Taming Text, co-founder of Apache Mahout and a long-standing committer on the Apache Lucene and Solr open source projects. Grant’s experience includes managing a large team of engineers, researchers and data scientists at a top ten website, as well as engineering search, question answering, and natural language processing applications for a variety of domains and languages. He earned his B.S. in Math and Computer Science from Amherst College and his M.S. in Computer Science from Syracuse University.
Machine Learning Consultant
Daniel is an independent consultant specializing in search, machine learning / AI, and data science. He was a founding employee of Endeca, a search pioneer that Oracle acquired in 2011. He then led engineering and data science teams at Google and LinkedIn. He’s worked with a wide range of consulting clients, including Apple, eBay, Pinterest, Salesforce, Yelp, and Zoom. He wrote a book on Faceted Search, published by Morgan & Claypool, and he blogs on Medium about search-related topics, particularly query understanding. Daniel has degrees in Computer Science and Math from MIT and a PhD in Computer Science from CMU.
Information retrieval has been around for decades, but recent developments have brought search engines to the forefront of the digital age. Companies, from global technology giants to small retailers and publishers, rely on search as a gateway to information, commerce, and all manner of products and services. And, unlike most of the other ways we interact with the internet, search puts the user in the driver’s seat, starting with the user’s explicitly expressed intent in the form of a search query.
Surprisingly, the fundamentals of search engines are not part of the mainstream computer science or software engineering curriculum. That is a shame, because search engines not only enable useful applications but are also a great domain in which to apply the latest developments in machine learning and AI.
Over the next four weeks, we will cover the core machine learning and natural language processing capabilities you need to improve retrieval and ranking, as well as to understand queries and content. We’ll use OpenSearch, an open fork of Elasticsearch, but what you learn will apply to other search platforms, such as Apache Solr and Vespa. We hope you come away from this course not only with practical skills, but also with a passion for search as a subject and with new insights about how machine learning can improve search.
- How to model and measure search relevance
- Relevance, ranking, diversity – how they all fit together
- How to use machine learning for ranking results
- Techniques for content classification and annotation, such as entity recognition
- Supervised and unsupervised machine learning methods for content understanding
- How to build, evaluate, and improve a neural content classifier
- Fundamentals and key components of query understanding
- How to combine query understanding with retrieval and ranking
- How to build, evaluate, and improve a neural query classifier
- Using dense vector representations for semantic search
- Indexing and querying approaches for dense vectors
- Using vectors for query and content similarity
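As a rough illustration of the dense vector topics above, here is a minimal sketch of ranking content by vector similarity to a query. It is not course material: the function names, the toy three-dimensional vectors, and the choice of cosine similarity are all assumptions made for the example. A real system would get its vectors from a trained embedding model and use an approximate nearest-neighbor index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
docs = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The exhaustive scan here compares the query against every document, which is fine for a handful of items but is the part that dedicated vector indexes replace at scale.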
I feel that taking the course helped me better understand the Search domain in my company, which is very useful for my work. Thanks for a great experience!
The best way to learn how to apply ML to search in a collaborative and friendly way!
If you want to gain practical knowledge of using Machine Learning on OpenSearch, this course is definitely for you.
It was difficult, but at a level of difficulty that will make you proud of yourself! I learnt so much! Exactly what I dreamt of.
I had a blast! 4 weeks of intense immersion, with great instructors and a very open and sharing community. I now have new tools in my toolbox, and new additions to my professional network to ask when the need arises.
I have been working in Search for a long time, and for 90% of that time I have used Solr / Lucene. Recently, I realized that there are many other search engines that have a big market share and are in high demand. I was already contributing the OpenSearch integration to Quepid (an open source relevancy tuning tool from OSC), and a few coworkers and other community members started talking about this course. My initial motivation for joining was simply to fast-track my learning of the OpenSearch engine, but the first session already had me on my toes :) (in a good way). I loved the insightful questions and discussions, and to be honest, some questions helped me open up my perspective on certain concepts. The drive to submit the projects and assignments, to answer questions from others, and to help one another was one of a kind in this class. Not just the students, but the teachers and the entire team of organisers made sure that students got what they needed to be comfortable and keep learning during the course (even on weekends). I can proudly say that I have had a 7-day work week for the last 4 weeks, and I have no complaints, as so did everyone in this class, and we all learnt!!! Some things stressed us initially, like limited hours on Gitpod, working with new libraries, a new environment to deal with, and some bugs and unclear instructions, but everyone was graceful enough to accept and correct them and truly adapted to the feedback, which makes this particular course different. I have delivered Solr trainings myself in the past and will continue to in the future, but honestly, participating in this course will certainly change my training style. A big thank you to both trainers, Grant and Daniel, and to the organizers. It was an awesome class, and best wishes to the future classes!!
A must have course for search engineers and anyone looking for a solid introduction to search with ML. Highly recommended!
The course is really great if you're someone who likes to work hands-on first and learn along the way as you're doing the assignment. The time schedule is very flexible: lectures are recorded, and there are multiple coding sessions for different time zones. I'd say this is one of the best teaching approaches I've tried so far!
Search with Machine Learning is an incredible course taught by two industry experts and a community of fellow search engineers. It was an intense four weeks but I walked away with a deep understanding of how to build a great search engine. I am already applying these skills in my workplace and will surely leverage this knowledge in the future.
This course is a great resource. It's like a live reference book for anything from Indexing to Query Understanding. The authors are very approachable and they share a lot of their expertise on the techniques and their business applications. And it's good to see so many people working in the field and being so generous with their time to help and share their experience. Excellent overall!
The co:rise Search with Machine Learning class was four solid weeks of the right mix of lectures, eminent guests' talks, weekly projects/homework, and ongoing near-real-time interactions with the industry-veteran instructors as well as fellow students. That interaction alone was priceless! As a search engineer, consultant, and practitioner, I found that the co:rise class significantly strengthened my knowledge of, and confidence in, Learning-to-Rank and the use of Machine Learning techniques for content and query classification, all for improving Relevancy. I strongly recommend this class to anyone involved in designing, developing, and supporting information retrieval ("search") solutions. Thank you Grant, Daniel, Judy, and Amber at co:rise for a great class.
Thanks for the awesome course! It really helped me improve my skill set in the area of information retrieval.
I'm happy to have taken this course: it is well organized, with a very supportive and caring teaching crew, Grant and Daniel. Both bring unique perspectives -- engineer's and data scientist's -- to the topic of Search, showing how multifaceted it has become over the years. The atmosphere in the course has been awesome: everyone is so supportive, sharing ideas, asking tough questions, or even sharing that little recipe to overcome a programming / infrastructure issue. Saved me a ton of time! I've also enjoyed interacting with Judy during the course, as she regularly checked in on how I was doing and what kind of feedback I had, overall creating a positive and supportive vibe to keep going. I have already recommended the course to one of my clients for providing a practical framework of Thinking like a Search Engineer and Scientist. You will get a shared feeling that you are not the only person in the world solving a tough Search challenge, and maybe even meet new friends in the field.
The whole course was extremely relevant (no pun intended) for my work in search and I could feel the ideas coming up in my head as the weeks progressed.
This has been a great learning experience of applying some of the concepts in machine learning to search!
I can think of no better person to teach a class on 'search with machine learning' than Grant Ingersoll. Through his open source work on the Apache Solr search engine and as a founder of the Apache Mahout machine learning framework, Grant has done more to teach developers about the technology concepts and applications of ML and search than anyone in the business. If you type “machine learning” and “search” into Google, the top result is Lucidworks, a company for which Grant was the driving force. This is a no-brainer: take this class from Grant!
Grant is an established expert in the area of Search, Machine Learning, and AI! He has the mind of both a researcher and an educator, and he has demonstrated the applicability of these deep topic areas to real-world products used by enterprises. What better way to learn than from someone who is a published author, a hands-on practitioner, and an industry expert?
Grant has a unique combination of breadth and depth in the search space, from designing a search solution for performance and stability at scale to deep expertise in the internals of the engine, such as text analysis, relevancy tuning, query optimization, and index design. You won’t find a better teacher to provide a solid foundation in the theory as well as how to apply it when building real-world search applications.
Daniel has unparalleled experience in search and machine learning. He brings practical experience and has consulted for a wide range of companies, from Apple to Zoom.
Daniel is a well-known expert on what useful search looks like in practice. Between his founding role as Endeca's chief scientist, his experience at Google and LinkedIn, and his consulting for numerous tech companies and retailers, he has seen and done it all. If you want to learn about relevance, user happiness, and optimizing search applications so they actually help people find things, you cannot do better than a course from Daniel.
Software engineers and data scientists looking to learn more about how search engines work, and specifically where they can best apply machine learning to improve search quality.
Ability to write Python and work with documented libraries.
Comfort working with web applications, Docker basics (e.g. start, stop) and the command line.
A Search Fundamentals (https://corise.com/course/search-fundamentals) class certificate, or academic or industry experience working with search engines such as Elasticsearch/OpenSearch/Solr/Vespa. We will not be teaching search fundamentals in this course.