Maximal Marginal Relevance
Maximal Marginal Relevance (MMR) is a technique for balancing relevance and diversity in the results of a search query. It is particularly useful when you want to avoid presenting redundant information to users. MMR selects items one at a time: each new item must be relevant to the query and, at the same time, as different as possible from the items already selected. It is often used in summarisation tasks, where the aim is to cover different aspects of a topic without repeating the same information.
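The selection rule described above is commonly written as a single scoring criterion. The symbols here are notational assumptions rather than anything defined in this page: R is the set of candidate items, S the set already selected, q the query, Sim_1 and Sim_2 the similarity measures for relevance and redundancy (often the same measure, such as cosine similarity), and λ the trade-off weight between 0 and 1.

```latex
\mathrm{MMR} = \arg\max_{d_i \in R \setminus S}
  \Bigl[ \lambda \, \mathrm{Sim}_1(d_i, q)
       - (1 - \lambda) \max_{d_j \in S} \mathrm{Sim}_2(d_i, d_j) \Bigr]
```

The first term rewards relevance to the query; the second penalises similarity to the closest item already chosen. The item with the highest score is added to S, and the step repeats until enough items have been selected.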
If the goal is to retrieve information quickly from a vast dataset where exhaustive search isn't feasible, approximate similarity search would be suitable.
If the aim is to provide a broad, diverse set of results to a query, avoiding redundant information, MMR would be the better choice.
Parameters for MMR search:
k Parameter: MMR does not use k the way k-nearest neighbours does. Rather than returning the k closest items in one pass, it builds a ranked list one item at a time, choosing each new item for its combination of relevance to the query and difference from the items already selected. Where an implementation exposes a "k" parameter, it typically sets how many items that final list contains.
Fetch k Parameter: Some implementations of MMR, especially those built into a search or recommendation system, add a "fetch k" parameter. In LangChain's vector store interface, for example, fetch_k is the number of candidates retrieved by plain similarity search and handed to MMR, while k is the number of results returned. To display 10 search results you might set k to 10 and fetch k to a larger value such as 50; MMR then iterates over those 50 candidates until it has chosen 10 that balance relevance and diversity according to the lambda parameter.
Lambda (λ) Parameter: This is the parameter specific to the MMR algorithm. Lambda (λ) dictates the trade-off between relevance and diversity in the set of items selected by the algorithm. It usually ranges between 0 and 1:
When λ is close to 1, the algorithm prioritises relevance, making the result set more focused on the query but potentially less diverse.
When λ is close to 0, the algorithm prioritises diversity, ensuring a wider range of results at the possible expense of individual item relevance.
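To make the three parameters concrete, here is a minimal sketch in Python with NumPy. It is not taken from any particular library: the function name `mmr_search`, its parameter names, and the toy 2-D vectors are illustrative assumptions. It also assumes that `fetch_k` is the size of the candidate pool and `k` the number of results returned, and that cosine similarity is used for both relevance and redundancy.

```python
import numpy as np


def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between each row of a and each row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mmr_search(query, docs, k=2, fetch_k=4, lam=0.5):
    """Return the indices of k documents chosen by greedy MMR.

    fetch_k: size of the candidate pool (the items most similar to the query).
    k:       number of results returned (hypothetical parameter name).
    lam:     trade-off weight; 1.0 = pure relevance, 0.0 = pure diversity.
    """
    query = np.asarray(query, dtype=float).reshape(1, -1)
    docs = np.asarray(docs, dtype=float)
    relevance = cosine_sim(query, docs)[0]
    pairwise = cosine_sim(docs, docs)

    # Stage 1: keep only the fetch_k candidates most similar to the query.
    order = np.argsort(-relevance, kind="stable")[:fetch_k]
    candidates = [int(i) for i in order]

    # Stage 2: greedily pick the candidate with the best MMR score.
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max((pairwise[i, j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: unit vectors at 40, 42, 60 and 10 degrees; the query points at 45.
# Documents 0 and 1 are near-duplicates of each other.
angles = np.deg2rad([40, 42, 60, 10])
docs = np.column_stack([np.cos(angles), np.sin(angles)])
query = [1.0, 1.0]

print(mmr_search(query, docs, k=2, fetch_k=4, lam=1.0))  # [1, 0]
print(mmr_search(query, docs, k=2, fetch_k=4, lam=0.5))  # [1, 2]
print(mmr_search(query, docs, k=2, fetch_k=2, lam=0.5))  # [1, 0]
```

With λ = 1 both near-duplicates are returned because only relevance counts. With λ = 0.5 the second slot goes to document 2, which is slightly less relevant but less redundant. Shrinking the pool to fetch_k = 2 leaves only the near-duplicates to choose from, so a larger candidate pool is what gives MMR room to diversify.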