Using the HRS

Insights and advice for new and seasoned users of the Health and Retirement Study

Featured Article: Cognition and Dementia

Assessment of Cognition Using Surveys and Neuropsychological Assessment: The Health and Retirement Study and the Aging, Demographics, and Memory Study
Eileen M. Crimmins, Jung Ki Kim, Kenneth M. Langa, and David R. Weir
J Gerontol B Psychol Sci Soc Sci (2011) 66B (suppl 1): i162-i171. doi: 10.1093/geronb/gbr048

One of the best resources for HRS users, beyond the extensive documentation provided by HRS, is the body of published articles evaluating the HRS sample and data. I rely heavily on these articles for insights into issues surrounding data quality and data use.

Continue reading

What are Proxies?

Much of the value of the HRS as a study of aging is in its use of a number of innovative survey research techniques, including the use of proxy interviews. Researchers who include respondents with interviews by proxy in their analytic sample should be well informed about what a proxy interview is and how these respondents differ from the rest of the sample.

Continue reading

Why are There People Under Age 50 in the Data?

(The short answer: HRS samples households, not individuals, and in some households people over age 50 are married to people under age 50.)

The very first conundrum I encountered as a new user of HRS data was the presence of people as young as age 25 in the data set. First I puzzled over how a survey of adults nearing or beyond retirement could contain hundreds of respondents in their 20s, 30s, and 40s. Then I puzzled over how to exclude these younger folks from my analysis. Fortunately, it’s a relatively straightforward matter.

Continue reading


Beware of version updates!

Data management of large, complex surveys like the HRS is a fairly difficult and time-consuming task. The HRS staff endeavors to produce a public-release data set as soon as possible (and they produce them much faster than many other publicly funded data collections!), so they release an early version of the data: Core Early Release (V1.0)*. The updated RAND files tend to follow soon after.

HRS does an early release to get the data to the public as fast as possible, but HRS staff continues to process the data until they have a final data release. Sometimes the early release is ultimately designated as the final release – you’ll know this is the case if you see “Final V1.0” followed by the date of release. But sometimes there are issues in the data that need to be resolved. These tend to be the result of programming errors, but there are sometimes problems with the data that the HRS staff catch during data inspections after the early release (e.g., a case is designated as a non-sample member after closer inspection). So, in some years the final release is a V1.0, but in other years it may be a V2.0 or V3.0. And in rare cases the final release is a V4.0 or V5.0.

What is the significance of all of this for the user? Continue reading



First, a disclaimer. I chose the title of this post to reflect how I think others view these data products. These are not, in fact, competing data products, nor are they alternatives to one another. I view these data products as complementary, each with individual strengths and weaknesses, but ultimately a powerful data analysis resource when combined. Below I briefly describe the difference between the two data products and outline the pros and cons of using each (though my ultimate recommendation is to combine the two data products for a “pros-only” data set).

Continue reading


Introducing the Blog

On Feb. 5th, 2009 I became a registered HRS user. I was finishing my last semester in a PhD program and, as I’d spent the previous years working on a different national, longitudinal study of adult health, I felt ready to tackle this new data set with which I could pursue additional topics in aging-related research. So, I downloaded some data and started exploring. Within minutes I had questions… so, so many questions (like, what are all of these files and why are there 20-year-olds in a study of older adults!?!)… and I was stuck. In retrospect, I should have devoted much more time to studying the documentation.

Fortunately I was a grad student at the University of Michigan at the time, working in the same building as the HRS and many of its users. I was helped in my early days of using the HRS by some very friendly staff (thanks Gwen Fisher!) and faculty (thanks Philippa Clarke!). Since then I have continued to rely on the generous nature of HRS personnel and other HRS users to solve my data conundrums. I’ve also read many of the informative documents provided by the HRS to its users. I now help others who are getting started with HRS data. And you know what? They encounter the same problems I did, and have many of the same questions.

The HRS can seem formidable to new users. To be honest, I’ve been using the data for more than seven years and it still seems formidable to me at times. But this data is worth it, my friend, it’s definitely worth it. HRS data provide so many exciting opportunities for conducting studies that make significant contributions to debates, both scientific and policy, on the nature of aging in the U.S. and globally. The trick is to successfully navigate past the common user errors and data pitfalls as you journey from new user to expert user. This blog is intended to help you on your HRS journey, whether you are just getting started or have been working with the data for years.