Annotated Bibliography for Matching

This has been written up primarily to keep my thoughts in order about a current research topic, namely, the interaction between matching and disclosure limitation. Don't take anything I say here too seriously :-)

If you're reading this, you should probably also know: Bill Winkler (mentioned multiple times below) has recently written an extensive survey on record linkage work.

The statistical literature

String edit based metrics for record linkage

A nice description of various edit distance algorithms, together with probabilistic versions of these algorithms based on hidden Markov models (HMMs), can be found in the book Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids by Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison.
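For readers who have not seen an edit distance computed, here is a minimal sketch of the simplest variant: Levenshtein distance with unit costs for insertion, deletion, and substitution. It is not taken from the book; the function name edit_distance and the unit costs are my own assumptions for illustration, and the probabilistic HMM-based versions are considerably more involved.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, assuming unit costs
    for insertion, deletion, and substitution (illustrative choice)."""
    # prev[j] = distance between the first i-1 characters of a
    # and the first j characters of b (row of the DP table).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Transforming the first i characters of a into the empty
        # string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                      # delete ca from a
                curr[j - 1] + 1,                  # insert cb into a
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Two spellings of a surname, as might appear in records to be linked.
    print(edit_distance("smith", "smyth"))
```

In a record linkage setting, a distance like this would typically be normalized by string length or converted to a similarity score before being compared across fields; that step is left out here.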

Matching workshops and other resource pages

