if (!require(devtools)) install.packages("devtools")

devtools::install_github("snfagora/MapAgora")
library(MapAgora)

IRS filings

IRS data provide comprehensive demographic and financial information on civic organizations.

Load the IRS index

To begin, load the IRS index for a selected year.

# load the 2018 index file from the IRS repository on Amazon Web Services (AWS)
idx_2018 <- import_idx(2018)

Get the AWS URL for the filing of interest

Digitized 990 data are available as XML files in an Amazon Web Services (AWS) repository. A yearly index file organizes these files. The year, in this case, refers to the year the IRS processed the filing, not the tax year the filing covers. For example, all the files in the 2018 index were processed in 2018, but any of them can cover an earlier tax period. This discrepancy makes it challenging to locate a specific filing for a given organization and year.

To remedy this, we rebuilt a comprehensive index file for this package. This index, included in the package, contains the additional fields TaxPeriodBeginDt, TaxPeriodEndDt, and (importantly) Tax_Year. Tax_Year refers to the tax year of the 990 form the organization submitted, whereas IRS_Year refers to the year the IRS converted the submission into digitized form. Note that organizations can report on either a calendar year or a fiscal year, so TaxPeriodBeginDt and TaxPeriodEndDt give the beginning and end dates of the period that filing covers.
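The difference between the two year fields can be illustrated with a small toy index. This is only a sketch: the rows are invented, the EIN column name is an assumption, and the real index shipped with the package may be laid out differently.

```r
# Toy index with invented rows, NOT real IRS data. Only Tax_Year, IRS_Year,
# TaxPeriodBeginDt, and TaxPeriodEndDt are field names taken from the text;
# the EIN column name is an assumption.
toy_idx <- data.frame(
  EIN = c("000000001", "000000001", "000000002"),
  IRS_Year = c(2018, 2019, 2018),
  Tax_Year = c(2017, 2018, 2016),
  TaxPeriodBeginDt = as.Date(c("2017-07-01", "2018-07-01", "2016-01-01")),
  TaxPeriodEndDt = as.Date(c("2018-06-30", "2019-06-30", "2016-12-31")),
  stringsAsFactors = FALSE
)

# Filings the IRS processed in 2018 (what a yearly index groups together)
processed_2018 <- toy_idx[toy_idx$IRS_Year == 2018, ]

# Filings that cover tax year 2018, regardless of when they were processed
tax_year_2018 <- toy_idx[toy_idx$Tax_Year == 2018, ]

# Flag fiscal-year filers: their tax period does not start on January 1
toy_idx$fiscal_year <- format(toy_idx$TaxPeriodBeginDt, "%m-%d") != "01-01"
```

In this toy data, none of the rows processed in 2018 covers tax year 2018, which is the situation that makes a lookup by Tax_Year necessary.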

The location of a given digitized filing can be found with the get_aws_url function. This function returns the location of the XML file in the AWS repository.

# this organization's 2018 990 filing can be found here
aws_url <- get_aws_url("061553389", 2018)

Get the XML file for a given AWS URL

To parse fields from filings, first load the XML file of interest with get_990(). Because get_990() calls get_aws_url() internally, there is no need to look up the URL first. For most uses of this package, get_990() is the first step.

## load an XML for this organization's 2018 filing.
xml_root <- get_990("221405099", year = 2018)

## see name of this organization
organization_name <- get_organization_name_990("221405099")
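When several organizations are processed in one run, a single failed download should not stop the whole loop. The sketch below shows one way to guard for that with tryCatch(). The helper fetch_many() is hypothetical and not part of MapAgora, and fake_fetch() is a stand-in so that the example runs offline. It also assumes that a failed retrieval surfaces as an R error.

```r
# Hypothetical helper (not part of MapAgora): apply a fetch function to
# several EINs and keep going when one of them fails.
fetch_many <- function(eins, fetch, ...) {
  results <- lapply(eins, function(ein) {
    tryCatch(
      fetch(ein, ...),
      error = function(e) {
        message("Skipping ", ein, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- eins
  # Drop the EINs whose fetch failed
  Filter(Negate(is.null), results)
}

# Stand-in for a real fetch function so the sketch runs offline;
# it deliberately fails for one invented EIN.
fake_fetch <- function(ein, year) {
  if (ein == "000000002") stop("no filing found")
  paste0("xml-root-for-", ein, "-", year)
}

roots <- fetch_many(c("000000001", "000000002", "000000003"),
                    fake_fetch, year = 2018)
names(roots)
```

With the package loaded, the corresponding call would presumably be fetch_many(eins, get_990, year = 2018), provided get_990() signals an error when a filing cannot be retrieved.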

Get a specific field from a filing

Nonprofit organizations can file different versions of the 990 form depending on their specific status. Knowledge of the specific form type is not required to extract values with this package, but the type of a given form can be seen with the following:

## see form type of a filing
filing_type <- get_filing_type_990(xml_root)

Available parsed fields can be seen in the irs_fields table. The package_variable column lists the variable names to be used in functions. Other columns show both the XML path and the physical form location for that variable. These variables are grouped by category and subcategory.

## see the columns of the irs_fields table (variable names are in package_variable)
names(irs_fields)

## get total revenue for this org
revenue_total <- get_single_value_990(xml_root, "revenue_total")

Organizations report related entities on Form 990 Schedule R. The EINs of related organizations can be found with get_scheduleR().

## see related EINs for a given EIN and given tax year filing.
related_eins <- get_scheduleR("061553389", 2018)

If the filing reports multiple related organizations, the function returns all of their EINs. If it reports none, the function indicates that instead.
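Related-organization results for several filers are often easier to work with as an edge list. The following is a minimal sketch on invented data. It assumes each result is a character vector of EINs and uses an empty vector to stand in for "no related organizations"; the actual return value of get_scheduleR() may signal that case differently.

```r
# Invented results: each element holds the related EINs reported by one filer.
# character(0) is an assumed placeholder for "none reported".
related <- list(
  "000000001" = c("000000002", "000000003"),
  "000000002" = character(0),
  "000000003" = "000000001"
)

# One row per (filer, related organization) pair
edges <- data.frame(
  from = rep(names(related), lengths(related)),
  to = unlist(related, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Filers that reported no related organizations
no_related <- names(related)[lengths(related) == 0]
```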

Extract mission and description texts

Organizations report descriptive information about their primary mission and main activities in 990 filings. The concatenated responses to these fields can be extracted.

# mission statement
mission_desc <- get_value_990(xml_root, "mission_desc")
# program description
program_desc <- get_value_990(xml_root, "program_desc")
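Once extracted, these texts can be processed with ordinary string tools. The sketch below counts terms in an invented mission statement using base R only. It assumes the extracted value is a single character string, and the stop-word list is a tiny ad hoc example rather than a recommended one.

```r
# Invented example text, not taken from any real filing
mission_text <- "To provide food, shelter, and job training. Food programs serve families."

# Lowercase, strip punctuation, and split on whitespace
words <- strsplit(gsub("[[:punct:]]", "", tolower(mission_text)), "\\s+")[[1]]

# Drop a few very common words (a tiny, ad hoc stop list)
stop_words <- c("to", "and", "the", "of", "a")
words <- words[!words %in% stop_words]

# Term frequencies, most frequent first
term_counts <- sort(table(words), decreasing = TRUE)
head(term_counts, 3)
```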