Using Sampled Data and Regression to Merge Search Engine Results

By Luo Si et al.
Published on Aug. 11, 2015

Table of Contents

1. ABSTRACT
2. INTRODUCTION
3. PRIOR RESEARCH
4. REGRESSION MODEL
5. MODEL ADJUSTMENT

Summary

This paper addresses the problem of merging results obtained from different databases and search engines in a distributed information retrieval (IR) environment. It introduces an approach that uses sampled data and regression to improve the effectiveness of results merging. The research focuses on uncooperative environments that contain multiple types of independent search engines, where the individual engines do not share the statistics needed to normalize their scores. The approach builds functions that transform the document scores returned by each search engine into a normalized, directly comparable form. Experiments compare the new approach with the CORI results merging algorithm and report that it is more effective across diverse network environments. The paper also discusses related work, including prior research on acquiring resource descriptions, and the challenges of results merging in distributed IR. The proposed regression model addresses the merging problem by mapping database-specific document scores to centralized scores, so that ranked lists from different databases can be merged into a single ranking.
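The score-mapping idea can be illustrated with a minimal sketch. It assumes a simple linear form for the mapping, which this summary does not specify, and it assumes that some returned documents also have a known centralized score (for example, from a sample index). The names `fit_linear_map`, `merge_results`, `db_results`, and `sample_scores` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def fit_linear_map(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of: centralized_score = a * db_score + b."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlap documents")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        raise ValueError("overlap documents must have distinct scores")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = cov / var_x
    b = mean_y - a * mean_x
    return a, b


def merge_results(
    db_results: Dict[str, List[Tuple[str, float]]],
    sample_scores: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Map each database's scores to centralized scores, then merge.

    db_results: database name -> ranked list of (doc_id, db_specific_score).
    sample_scores: doc_id -> centralized score for documents whose
        centralized score is known (an assumption of this sketch).
    """
    merged: List[Tuple[str, float]] = []
    for ranked in db_results.values():
        # Training pairs: documents with both a database-specific score
        # and a centralized score.
        pairs = [(s, sample_scores[d]) for d, s in ranked if d in sample_scores]
        a, b = fit_linear_map(pairs)
        # Apply the fitted mapping to every document from this database.
        merged.extend((d, a * s + b) for d, s in ranked)
    # Single ranking by estimated centralized score, best first.
    return sorted(merged, key=lambda item: item[1], reverse=True)


db_results = {
    "A": [("a1", 10.0), ("a2", 8.0), ("a3", 6.0)],
    "B": [("b1", 0.9), ("b2", 0.45), ("b3", 0.1)],
}
sample_scores = {"a1": 0.5, "a3": 0.3, "b1": 0.9, "b3": 0.1}
ranking = merge_results(db_results, sample_scores)
```

In this toy input, database A scores on a 0 to 10 scale and database B on a 0 to 1 scale, so the raw scores are not comparable. After each database's scores are mapped through its own fitted function, documents from both lists can be interleaved in one ranking. A database with fewer than two overlap documents raises an error here; how to handle that case is left open in this sketch.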