Using the HRS

Insights and advice for new and seasoned users of the Health and Retirement Study


What are Proxies?

Much of the value of the HRS as a study of aging lies in its use of innovative survey research techniques, including proxy interviews. Researchers who include respondents interviewed by proxy in their analytic sample should be well informed about what a proxy interview is and how these respondents differ from the rest of the sample.



Why are There People Under Age 50 in the Data?

(The short answer: the HRS samples households, not individuals, and in some households people over age 50 are married to people under age 50.)

The very first conundrum I encountered as a new user of HRS data was the presence of people as young as age 25 in the data set. First, I puzzled over how a survey of adults nearing or beyond retirement could contain hundreds of respondents in their 20s, 30s, and 40s. Then I puzzled over how to exclude these younger folks from my analysis. Fortunately, it’s a relatively straightforward matter.
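As a rough illustration of excluding younger respondents, here is a minimal Python sketch. It is not the method from the full post: the field names `person_id` and `age` are hypothetical placeholders rather than actual HRS or RAND variable names, the records are made up, and the cutoff of 50 is an assumption that should be adjusted to the analysis at hand.

```python
# Made-up records for illustration only. "person_id" and "age" are
# hypothetical field names, not real HRS or RAND variables.
respondents = [
    {"person_id": "A", "age": 62},
    {"person_id": "B", "age": 34},    # younger spouse of a sampled respondent
    {"person_id": "C", "age": 50},
    {"person_id": "D", "age": None},  # age missing
]


def drop_younger(records, min_age=50):
    """Keep records whose age is known and is at least min_age."""
    return [
        r for r in records
        if r["age"] is not None and r["age"] >= min_age
    ]


analytic_sample = drop_younger(respondents)
print([r["person_id"] for r in analytic_sample])
```

Records with a missing age are dropped here along with the younger respondents; whether that is appropriate depends on the analysis.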




Beware of version updates!

Data management of large, complex surveys like the HRS is a fairly difficult and time-consuming task. The HRS staff endeavors to produce a public-release data set as soon as possible (and they produce them much faster than many other publicly funded data collections!), so they release an early version of the data: Core Early Release (V1.0)*. The updated RAND files tend to follow soon after.

HRS issues an early release to get the data to the public as fast as possible, but the staff continue to process the data until they have a final release. Sometimes the early release is ultimately designated as the final release; you’ll know this is the case if you see “Final V1.0” followed by the date of release. But sometimes there are issues in the data that need to be resolved. These tend to be the result of programming errors, but sometimes HRS staff catch problems during data inspections after the early release (e.g., a case is designated as a non-sample member after closer inspection). So in some years the final release is a V1.0, in other years it may be a V2.0 or V3.0, and in rare cases it is a V4.0 or V5.0.
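One way to keep track of which release a local copy came from is to record its label and compare it with the label currently posted. The sketch below is only an illustration. It assumes labels look like the examples above ("Core Early Release (V1.0)", "Final V2.0"), and the helper names `parse_release_label` and `needs_recheck` are hypothetical, not part of any HRS or RAND tool.

```python
import re


def parse_release_label(label):
    """Return (is_final, (major, minor)) from a label such as 'Final V2.0'.

    Assumes the label contains a version written as V<major>.<minor>.
    """
    match = re.search(r"V(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"no version found in label: {label!r}")
    is_final = "final" in label.lower()
    return is_final, (int(match.group(1)), int(match.group(2)))


def needs_recheck(local_label, posted_label):
    """True if the posted release is newer, or final when the local copy is not."""
    local_final, local_version = parse_release_label(local_label)
    posted_final, posted_version = parse_release_label(posted_label)
    return posted_version > local_version or (posted_final and not local_final)


print(needs_recheck("Core Early Release (V1.0)", "Final V2.0"))
print(needs_recheck("Final V1.0", "Final V1.0"))
```

The check also flags an early V1.0 that has since been designated final, even though the data may be unchanged in that case; re-checking is harmless.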

What is the significance of all of this for the user?