Open-Source Software and Code Repositories
Open-Source Software
Open-Source Software (OSS, also known as open-source code) enables EPA development teams to easily collaborate with external developers, researchers, and members of the public who are interested in contributing to EPA projects.
Open-source code development is invaluable as the Agency develops new features, builds improvements, and fixes known and unknown issues. External contributors who share a common interest in a project can analyze the source code and verify that the software is correct, which helps support the validity of claims made in accompanying scientific papers or publications. Following EPA's established standardization specifications and code requirements is critical when developing open-source code because the code must be easy for all collaborators to read and understand. For information on open-source concepts and practices, visit OpenSource.org.
Effective August 8, 2016, the OMB Mandate: M-16-21; Federal Source Code Policy: Achieving Efficiency, Transparency, and Innovation through Reusable and Open Source Software applies to new custom-developed code created or procured by EPA.
To implement the mandate within EPA projects, EPA or contractors developing on its behalf will:
- Share custom-developed code that EPA develops or procures for broad reuse across the federal government, subject to limited exceptions.
- Release at least 20 percent of new custom-developed code to the public as OSS under a three-year pilot program.
- Maintain an Agency-wide posture of being “open first,” meaning it is EPA’s primary choice to develop or acquire custom-developed code that is broadly available to the public for inspection, improvement, and reuse.
- Ensure program offices follow OMB’s three-step software analysis outlined in M-16-21 and include contract requirements for open-source code when applicable. This three-step analysis first considers existing federal solutions, then existing commercial solutions, and finally custom development. Contracts for custom-developed code must also acquire and enforce rights sufficient to enable government-wide reuse of that code.
- Update EPA’s IT acquisition processes to support and implement an OSS approach, as identified by the Chief Administrative Officer and Chief Information Officer.
- Establish an inventory of new custom-developed code and provide this inventory and associated metadata established by OMB for each project’s source code to appropriate repositories, including Code.gov.
- Use EPA’s standard version control system(s) allowing for compliance, as identified by the Chief Technology Officer.
- Apply EPA’s open source and rights license guidance for custom-developed code, government reuse and OSS application, as identified by M-16-21 in consultation with the Office of General Counsel.
- Release open-source code through a public-facing software version control platform.
- Provide the metadata that will be included in EPA’s code inventory.
- Publish the OSS inventory metadata file in JSON format.
The EPA-specific implementation of OMB Mandate M-16-21 is detailed in the System Life Cycle Management Procedure. EPA has chosen GitHub as both its version control system and its inventory of open-source code projects. EPA uses GitHub to inventory its custom-developed, open-source code and to generate the necessary metadata file, which is then posted to Code.gov for broad reuse.
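As a rough illustration of what a JSON inventory record might look like, the sketch below assembles one entry from the metadata fields named later on this page (description, tags, contact, status). The function name, the key names, and the sample values are assumptions made for this example; they are not confirmed to match the Code.gov schema or the output of EPA's own tool.

```python
import json


def build_inventory_entry(name, description, tags, contact_email, status):
    """Assemble one hypothetical inventory record as a plain dictionary."""
    # Description, contact, and status are treated as required here;
    # tags are optional ("if applicable").
    required = [("description", description),
                ("contact", contact_email),
                ("status", status)]
    missing = [label for label, value in required if not value]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return {
        "name": name,
        "description": description,
        "tags": list(tags),
        "contact": {"email": contact_email},
        "status": status,
    }


# Sample values are placeholders, not a real EPA project.
entry = build_inventory_entry(
    name="example-repo-name",
    description="Illustrative project description.",
    tags=["example"],
    contact_email="contact@example.gov",
    status="Development",
)
print(json.dumps([entry], indent=2))
```

Raising an error on an empty field mirrors the guidance to fill out every metadata field so that reporting stays accurate.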
Code Repositories
EPA uses several code repository platforms for different purposes. Bitbucket and GitLab are primarily used by EPA staff. The Federal Project Lead can provide additional details regarding the use of Bitbucket or GitLab if necessary. GitHub is the platform used most for collaboration with external stakeholders, and the following information relates only to GitHub.
EPA uses GitHub Enterprise Cloud, which includes an unlimited number of public and private repositories, though there are restrictions on the amount of activity within private repositories. If access to the EPA GitHub Enterprise system is needed, the Federal Project Lead can coordinate user account access with the GitHub Org Owner for their division or office.
Repository Management Prerequisites
GitHub requires a user license for access to features that are not available in the free version of GitHub. Licenses are primarily required for users who need administrative access, need to manage a repository through the 'maintain' permission set, or use certain features. The Federal Project Lead will determine whether a license is required.
External collaborators do not need a license if they only need to view and contribute to public EPA repositories.
Creating an EPA Code Repository
Licensed users can create private repositories. When creating a repository, the creator assumes the role of Repository Administrator. With these permissions, the admin can add and remove collaborators and licensed users from the related organization. See GitHub's Repository Roles in an Organization to better understand the role of the admin and other roles the admin can assign within the repo.
Repository creation requires the selection of an open-source license and the creation of several supporting files. These requirements are detailed below. After creation, management and maintenance tasks should be completed as necessary.
Select an Open-Source License
Selecting the appropriate open-source code license is a critical step because licenses are permanent once applied to a repository. EPA usually recommends the MIT license for its code. However, the Federal Project Lead will select the most appropriate open-source license from those available for EPA open-source software projects.
Repository Naming Conventions
New repositories should follow EPA's naming conventions for repositories. Repository names should be all lowercase, with words separated by hyphens, unless the name is an acronym that can be written in all capital letters. Examples:
- github.com/USEPA/example-repo-name
- github.com/USEPA/CODE
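The naming convention above can be checked mechanically. The following is a minimal sketch, assuming that names consist only of letters, digits, and single hyphens; the function name and both patterns are illustrative and are not an official EPA validation rule.

```python
import re

# Assumed pattern: lowercase words (letters or digits) joined by single hyphens.
LOWERCASE_HYPHENATED = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# Assumed pattern: an acronym written entirely in capital letters.
ACRONYM = re.compile(r"[A-Z][A-Z0-9]*")


def is_valid_repo_name(name: str) -> bool:
    """Return True for a lowercase-hyphenated name or an all-caps acronym."""
    return bool(LOWERCASE_HYPHENATED.fullmatch(name) or ACRONYM.fullmatch(name))


for candidate in ["example-repo-name", "CODE", "Example_Repo", "example--repo"]:
    print(candidate, is_valid_repo_name(candidate))
```

The two example names from this page pass the check, while mixed case, underscores, and doubled hyphens are rejected.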
Upload Files to Repository
Before code development begins, several files must be added to the repository to tell people what the code does and how to interact with it. The mandatory artifacts are listed below. Visit EPA's Open-Source Repository Example on GitHub to reuse and modify the files as needed.
Artifacts
- Open-source license (file name: LICENSE.md)
- Readme (file name: README.md)
  - Create a README file for the GitHub repository to describe why the project is useful, what can be done with the project, and how it can be implemented. A README file typically includes:
    - What the project does.
    - Why the project is useful.
    - How users can get started with the project.
    - Where users can get help with the project.
  - The following disclaimer must be included in the README file of public repositories:
    - Disclaimer: The United States Environmental Protection Agency provides this code on an "as is" basis and the user assumes responsibility for its use. EPA has relinquished control of the information and no longer has responsibility to protect the integrity, confidentiality, or availability of the information. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by EPA. The EPA seal and logo shall not be used in any manner to imply endorsement of any commercial product or activity by EPA or the United States Government.
- Contributing Policy (file name: CONTRIBUTING.md)
  - A contributing policy explains how external users can contribute and interact with the project. The policy will explain how contributors can submit an issue, request a change, or submit an inquiry. Some information requested of contributors includes:
    - What module or bug do they intend to address? What work do they intend to contribute?
    - Are they comfortable with the development strategy, including code consistency, benchmarking, configuration testing, compiler testing, model output validation, documentation, and merging? This may apply to applications but not necessarily to scripts.
    - Are they able to provide ongoing support and technical guidance for their proposed contribution?
  - Provide information on how contributors can maintain consistency when contributing to the code, including a list of coding conventions and standards for the repository.
  - For additional reference, see this example CONTRIBUTING.md from GSA.
- Metadata
  - Federal open-source law requires that EPA post metadata on open-source repositories internally and to code.gov. EPA has a tool (using GitHub Actions) to create the JSON file needed to post to code.gov. To ensure accurate reporting, fill out all the metadata fields in the repository, including:
    - Description.
    - Tags (if applicable).
    - Contact.
    - Status.
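A local pre-publication check for the required files listed above could look like the following sketch. It assumes the three files sit at the top level of a local checkout and that searching for the opening words of the disclaimer is a sufficient test; the function name and the marker string are choices made for this example, and the disclaimer check is only relevant to public repositories.

```python
from pathlib import Path

# File names taken from the artifact list on this page.
REQUIRED_FILES = ("LICENSE.md", "README.md", "CONTRIBUTING.md")
# Opening words of the required disclaimer, used as a simple marker.
DISCLAIMER_MARKER = (
    "The United States Environmental Protection Agency provides this code"
)


def missing_artifacts(repo_root):
    """Return a list of problems found in a local repository checkout."""
    root = Path(repo_root)
    problems = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    readme = root / "README.md"
    if readme.is_file():
        # Collapse whitespace so a line-wrapped disclaimer still matches.
        text = " ".join(readme.read_text(encoding="utf-8").split())
        if DISCLAIMER_MARKER not in text:
            problems.append("README.md: disclaimer text not found")
    return problems
```

Calling `missing_artifacts(".")` from the root of a checkout returns an empty list when all three files are present and the README contains the disclaimer.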
Repository Management
Once a repository has been created and all the necessary files have been uploaded, the Repository Administrator will have several ongoing or occasional maintenance tasks to complete. These tasks include managing user access, reviewing code, reviewing security logs, tracking GitHub Actions minutes usage, and communicating with the Org Owner.
Manage User Access
Throughout the lifecycle of the repository, users will need to be added and removed. GitHub offers guidance on how to add and remove users of various access levels.
- Managing Individual Access to a Repository or Organization.
- Adding Outside Collaborators to a Repository.
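Administrators who prefer scripting to the web interface may be able to add a collaborator through GitHub's REST API. The sketch below only builds the request and does not send it. The username `octocat` and the token placeholder are hypothetical, and whether API-based access management is permitted for a given EPA organization is an assumption to confirm with the Org Owner.

```python
import json
import urllib.request

# Permission names GitHub accepts for organization-owned repositories.
VALID_PERMISSIONS = {"pull", "triage", "push", "maintain", "admin"}


def build_add_collaborator_request(owner, repo, username, permission, token):
    """Build (but do not send) a request inviting a user to a repository."""
    if permission not in VALID_PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    url = f"https://api.github.com/repos/{owner}/{repo}/collaborators/{username}"
    body = json.dumps({"permission": permission}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values; replace the token with a real credential before sending.
request = build_add_collaborator_request(
    "USEPA", "example-repo-name", "octocat", "push",
    token="<personal-access-token>",
)
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to review the target repository and permission level before any change is made.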
Review Code
Review code contributions from external collaborators and integrate code commits into source code.
Repository Administrators will review all code submissions to ensure code stability and consistency and prevent degradation of code performance. After review, the admin will either accept the submission, recommend specific improvements to the submission, or in some cases reject the submission. To avoid outright rejection, developers contributing code should contact the repo administrator early in the development process and maintain contact throughout to help ensure the submission is compatible with the code base and is a robust addition.
GitHub Org Owner Communications
Several of the actions related to repositories may need to be performed by the Org Owner, who maintains elevated privileges. The Federal Project Lead should be able to communicate requests to the Org Owner. Some actions that may require assistance from an Org Owner include:
- Purchasing and assigning a new license.
- Assigning new privileges to users.
- Sharing a secret between repositories.
Public Web Content
In addition to the EPA security, privacy, and accessibility requirements, all open-source code must be consistent with EPA’s Web Guide, Procedures: Ensuring EPA Public Content in the EPA Web Environment. EPA open-source code and code-related documentation (e.g., README files, metadata, and developer documentation) must be in an approved EPA public repository (GitHub.com/USEPA) and be appropriate, following the standards linked above.