Focused crawls are collections of frequently updated web crawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
In this project we develop methodologies for estimating regression parameters when the response and predictors are not jointly observed but are instead brought together by an error-prone record linkage process that can create mismatches (incorrect links) and missed matches (true links that are not made).
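
To illustrate why linkage errors matter for regression, the following is a minimal sketch and not the project's own methodology. It assumes a simple linear model, a known correct-link probability `q`, and an exchangeable linkage error model in which a mismatched record receives the response of a uniformly chosen other record. All names (`simulate_linked_data`, `adjusted_design`, `q`) are hypothetical, and missed matches are not modeled here. The adjustment regresses the linked response on an adjusted design matrix, similar in spirit to bias corrections discussed in the linked-data regression literature.

```python
import numpy as np


def simulate_linked_data(n=5000, beta0=1.0, beta1=2.0, q=0.7, seed=0):
    """Simulate y = beta0 + beta1 * x + noise, then corrupt the links.

    Each record is mismatched with probability 1 - q (an assumed value).
    Mismatched records receive the response of another mismatched record.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)
    mismatched = np.flatnonzero(rng.random(n) > q)
    y_linked = y.copy()
    # Shift responses by one position within the mismatched set.
    y_linked[mismatched] = y[np.roll(mismatched, 1)]
    return x, y_linked


def ols(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def adjusted_design(X, q):
    """Return Q @ X without forming the n-by-n matrix Q.

    Under the assumed exchangeable model, Q has q on the diagonal and
    (1 - q) / (n - 1) elsewhere, so E[y_linked | X] = Q @ X @ beta.
    """
    n = X.shape[0]
    off = (1.0 - q) / (n - 1)
    return (q - off) * X + off * X.sum(axis=0, keepdims=True)


if __name__ == "__main__":
    q = 0.7
    x, y_linked = simulate_linked_data(q=q)
    X = np.column_stack([np.ones_like(x), x])
    print("naive slope:   ", ols(X, y_linked)[1])
    print("adjusted slope:", ols(adjusted_design(X, q), y_linked)[1])
```

In this setup the naive slope is attenuated toward zero by roughly the factor `q`, while the adjusted fit recovers a slope near the true value. In practice `q` is unknown and must itself be estimated, which is one reason more careful methodology is needed.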