Matthew Chingos and James Carter Share How to Make the Most of the Urban Institute’s Education Data Portal

March 2024

Matthew M. Chingos directs the Center on Education Data and Policy at the Urban Institute. Chingos and his colleague James Carter oversee Urban’s Education Data Portal, which draws on data from most major national data sources on schools, districts, and colleges, coordinating variables and documentation to make it easier to look at trends and combine data. Chingos presented on the Data Portal and its online download tool, the Education Data Explorer, at the AERA-NSF Fall Research Conference in November 2023.
 

Q. What is the Portal and what was the inspiration for developing it?

Matt: The Portal is a one-stop shop for all major national data sets on schools, districts, and colleges. The entire alphabet soup of data sets that education researchers know and love (CCD, CRDC, SAIPE, NHGIS, IPEDS, EDFacts) is available under one roof, with standard variable names across data sets and over time. We provide all available years of data, going back to the 1980s in some cases, and add new years of data as they come online.

The Portal initially launched in 2018, but we’ve added a lot of new data sets and ways to access them since then.

The story behind the Portal goes back to earlier in my career in education policy research. During my four years working at the Brookings Institution, I realized that informing policy conversations in real time requires quickly assembling the most relevant data. Producing one simple chart showing trends over the last 10 years is easy in theory but in practice requires hours or even days of harmonizing variable names.

When I came to Urban in 2015, I realized that our best-in-class research, communications, and technology and data science teams were uniquely positioned to collaborate on a fix to that problem.

Q. What makes the Portal unique?

Matt: The federal government makes new data sets available every year, but they change every year and those changes add up over time. The Urban Portal is the only place where you can get harmonized data sets, in some cases going back four decades. And it means that rather than learning how to access many different data sets, you just need to learn how to work with the Portal (through its many access points, including Stata and R packages and direct file downloads), and then you can access any national data set on schools, districts, or colleges.

The Portal is also the place where we host Urban-created data products like the first-ever nationally comparable measure of student poverty at the school level, which you can’t get anywhere else.

Q. How are researchers currently using the Portal?

James: The Portal is a huge time saver, especially for researchers looking to use the enormous amount of administrative data collected by the federal government about education. Instead of each researcher or research group downloading, cleaning, and analyzing the data individually, we do the first two steps and leave them free to spend their time on the analysis.


Q. What are some examples of research questions that scholars could explore through the Portal?

James: If you wanted to look at the relationship between per pupil spending and student-teacher ratios across school districts, normally you would need to download three files, clean them, merge them together, do some calculations, and then make some visualizations. For a Portal user, all of these data are in one place.
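As a rough illustration of the merge-and-calculate step James describes, here is a minimal Python sketch. It assumes district finance and staffing records have already been downloaded as lists of dictionaries. The field names (`leaid`, `year`, `exp_total`, `enrollment`, `teachers_fte`) are illustrative assumptions, not confirmed Portal variable names; check the documentation for the actual ones.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, int]  # (district identifier, year)


def index_by_key(rows: Iterable[dict],
                 id_field: str = "leaid",
                 year_field: str = "year") -> Dict[Key, dict]:
    """Index records by (district id, year) so two files can be joined."""
    return {(row[id_field], row[year_field]): row for row in rows}


def district_ratios(finance_rows: Iterable[dict],
                    staffing_rows: Iterable[dict]) -> List[dict]:
    """Join finance and staffing records, then compute two ratios.

    Field names are hypothetical placeholders for illustration only.
    """
    finance = index_by_key(finance_rows)
    staffing = index_by_key(staffing_rows)
    merged = []
    # Keep only district-years that appear in both files (an inner join).
    for key in sorted(finance.keys() & staffing.keys()):
        spending = finance[key].get("exp_total")
        enrollment = staffing[key].get("enrollment")
        teachers = staffing[key].get("teachers_fte")
        # Skip rows with missing or zero denominators rather than guessing.
        if spending is None or not enrollment or not teachers:
            continue
        merged.append({
            "leaid": key[0],
            "year": key[1],
            "per_pupil_spending": spending / enrollment,
            "student_teacher_ratio": enrollment / teachers,
        })
    return merged
```

Harmonized identifiers and variable names are what make a join like this short; with raw source files, most of the effort would go into reconciling the keys first.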

Or if you want to examine the relationship between college graduation rates, post-college earnings, and university endowments, without the Portal you’d also need to work with three different data sets. A Portal user can get all of those data formatted in a similar way and easy to merge.

Q. How does one access the Portal? How can you download data for individual use?

James: Maybe the strongest feature of the Portal is the API that allows users to access the data in many different ways. If you go to https://educationdata.urban.org/, you will find yourself at the start of our Education Data Explorer. The Explorer is a point-and-click tool that walks the user through finding the data they would like and then at the end produces a few CSV files of data and documentation that they can work with in Excel or another software package like Stata or R.

If you are a Stata or R user, we have built packages for those languages (in each case called “educationdata”) that will access, download, and label the data for you. Example syntax for each package can be found on our documentation page: https://educationdata.urban.org/documentation. If you are using a tool like Tableau or Power BI, you can make API calls to bring data directly into the tool for visualization. Full data sets are also available as CSV downloads on the documentation page.
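For readers working outside Stata or R, a direct API call might look something like the Python sketch below. The `/api/v1/{topic}/{source}/{endpoint}/{year}/` path layout and the `results` and `next` fields in the response are assumptions for illustration; confirm the exact URL structure and response format on the documentation page before relying on them.

```python
import json
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base path; only the domain comes from the article.
BASE_URL = "https://educationdata.urban.org/api/v1"


def build_url(topic: str, source: str, endpoint: str, year: int,
              filters: Optional[Dict[str, object]] = None) -> str:
    """Assemble a request URL under the assumed path layout."""
    url = f"{BASE_URL}/{topic}/{source}/{endpoint}/{year}/"
    if filters:
        url += "?" + urlencode(filters)
    return url


def fetch_json(url: str) -> dict:
    """Download one page of results and parse it as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_records(url: str,
                 fetch: Callable[[str], dict] = fetch_json) -> Iterator[dict]:
    """Yield records page by page, following the assumed `next` link."""
    next_url: Optional[str] = url
    while next_url:
        page = fetch(next_url)
        yield from page.get("results", [])
        next_url = page.get("next")
```

A call such as `list(iter_records(build_url("schools", "ccd", "directory", 2013, {"fips": 11})))` would then collect every record for that request, assuming those topic, source, and endpoint names exist. Passing the fetch function in as a parameter keeps the paging logic testable without network access.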

Q. What training do you offer on using the Portal?

James: We are happy to answer questions that come in to educationdata@urban.org, or users can check our FAQs. We have been doing trainings at conferences and a few presentations to classes and predoc programs; we offer these on request.

Q. What plans do you have for further building out the Portal? What additional data sets do you plan to add?

James: In addition to our normal data updates, there are a few near-term additions to the Portal that we are really excited about. We will be adding FAFSA completion at the high school level (we already have FAFSA submissions by postsecondary institution). We are working on the Private School Universe Survey to round out our coverage of elementary and secondary schools in the United States. We have also started modifying the infrastructure of the Portal to accommodate state-level data sets. When that is complete, we will add some NAEP data.

We always welcome suggestions from users on additional data sets or functions we should consider adding to the Portal. You can email us at educationdata@urban.org.

Q. What restrictions or requirements should users be aware of?

Matt: The Portal contains only national data sets, and given the time it takes for national data providers (mostly the federal government) to process data, the most recent data are often a couple of years old. States and localities often have more detailed or more recent data, but those data only cover a single place.

Users should also be aware that the Portal does not typically address underlying quality issues in the data we draw from original sources. For example, we do not attempt to solve issues with accuracy and missingness in prior years of the Civil Rights Data Collection. The Portal makes a wide range of data sets easier to access, but researchers should still make sure they understand the limitations of the underlying data.