In this lesson, you learn the steps for using existing open code in your work. These steps include discovering, assessing, reusing, citing, and acknowledging.
After completing this lesson, you should be able to:
Many people discover code through discussions with their colleagues or by reading journal articles and attending talks at conferences. This is a great way to find out about code that might have applications for your scientific problem.
What other ways can someone search for open code? As a first step, look for code that already exists, because chances are that someone else has already had a similar problem and published their code online. A common way to search for existing code is with a general search engine. Search results offer some indicators of a code’s relevance, such as how recently it was updated and how frequently others reference it.
Example | I’m a new graduate student starting to work on modeling turbulence in the Southern Ocean to better understand sea surface temperature (or ocean heat uptake) and climate change. Is there some software available to model how eddies in the ocean affect sea-surface temperature? |
Exercise | General Search on the term “Software for ocean turbulence modeling” |
Result | General Ocean Turbulence Model (GOTM) |
This successful search is predicated on the developers of GOTM making their code open.
Discovering open software depends on developers making their software easy to find. The Findable, Accessible, Interoperable and Reusable (FAIR) Principles for research software suggest:
Reference: Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). See also Module 1.
However, you may have more specific needs. The following sections cover additional ways to help discover relevant software that meets specific research demands.
A successful search for open code demands a clearly defined purpose. Developers must first determine the tasks they expect their code to carry out. The requirements associated with these tasks can determine the best suited programming language.
Next, familiarize yourself with the terminology of others who created open software with similar requirements to your own. The keywords affiliated with your programming purpose or requirements can serve as a starting point when searching for relevant code. These keywords can be found in community forums about open source programming and in related scientific journal articles. With adoption of open access principles by many academic journals, prospective programmers can peruse scientific papers from fields related to their research in order to find, and sometimes make use of, existing code that will fulfill their requirements.
The open software ecosystem is vast, organic, multifaceted, and highly distributed.
If you are looking for scientific software, community standards increasingly require code to be published and linked to scientific papers.
Thus, the scientific literature and its ancillary code archives are increasingly a great place to look for scientific open code.
Most open code is not developed by or for scientists. However, open code enables research every day.
There are several popular search engines for code snippets. First, you can simply search on Google. Other commonly used search engines include GitHub Code Search and Stack Overflow. These search engines allow you to search for specific code snippets by programming language, keyword, or other criteria. GitHub Code Search allows you to search GitHub, a popular code repository for scientific software. Stack Overflow allows you to search forums, where users discuss solutions to coding problems.
GitHub | GitLab | Bitbucket |
Example - GitHub Code Search
In this example, we will practice searching for open access code on GitHub. Let’s work through a scenario in which you would like to search for the Lomb and Scargle method for estimating a power spectrum.
Example background
GitHub enables users to collaborate on a shared project and track their changes with version control. Users can create a repository and grant access to selected collaborators, or make it public. GitHub hosts a large community of users who make their code openly available for free.
Example instruction
Begin by visiting the GitHub website to search for openly available software packages. You will need to create a free account for this action. Navigate to the Search Code page to begin your search and to access tutorials on the interface and capabilities of the search portal. Alternatively, you can enter your search terms in the search bar on your profile page. Search for “Lomb Scargle” to find several repositories with relevant code in various languages, along with thousands of related code snippets. Congratulations! You have begun your open software journey and can now view the work of thousands of others who once were where you are now. Upwards and onwards!
Screenshot of the repositories returned from our search
Screenshot of the code snippets returned from our search
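The same search can also be expressed as a URL for GitHub's REST API, which is handy when you want to repeat or script a search. The following is a minimal sketch, not an official recipe: the helper name `build_repo_search_url` and its defaults are illustrative choices, and it only builds the URL without contacting GitHub.

```python
from urllib.parse import urlencode

# Public GitHub REST API endpoint for repository search.
GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def build_repo_search_url(keywords, language=None, per_page=10):
    """Return a GitHub API URL that searches repositories by keyword.

    Hypothetical helper for illustration; the name and defaults are ours.
    """
    query = keywords
    if language:
        # "language:" is a GitHub search qualifier that narrows the results.
        query += f" language:{language}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{GITHUB_SEARCH_API}?{urlencode(params)}"


url = build_repo_search_url("Lomb Scargle", language="python")
print(url)
```

Opening the printed URL in a browser returns a JSON document whose `items` list holds the matching repositories. Unauthenticated requests are rate limited, so treat this as a way to explore rather than to harvest results in bulk.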
With open software, knowing where to search and what to search for can be a challenging problem. You can always start with a Google Search. However, it can be valuable to think through some of the questions that guide the discovery process. If you lack relevant experience, it can also be helpful to engage experienced colleagues at this stage.
Review the flow chart that illustrates how the search follows the definition of the need.
A software repository is an online collection of stand-alone application software packages. Repositories typically control access and track the deployments/downloads of packages.
Software packages are often provided as executables without code.
The collection typically includes metadata, documentation, and licensing restrictions on each package. It may include different software package versions and the platforms or environments on which the software package can be executed.
Most research code should be open source software, which is stored in code repositories.
Software Heritage | Open Source Development Network (OSDN) |
SourceForge | Free and Open-Source Software Hub (FOSSHUB) |
Google Code | Comprehensive Perl Archive Network (CPAN)
PyPI | CRAN
NASA Resources for Discovering Open Software
These are a few links to NASA-specific repositories that may be of interest:
So, you’ve discovered some exciting open code that might help you solve your scientific problem. Can you trust this code you discovered on the web? Will it be useful? How much time will it take to learn it? Could the code contain malware? Could you get in legal trouble for using it?
Examples: You found the “General Ocean Turbulence Model (GOTM)” on the internet, and it looks promising. Or, you just found lots of code snippets and functions related to the Lomb-Scargle power spectrum. Now you would like to assess these pieces of code to help you decide if you should use them. This section discusses some best practices for assessing if the code will help you.
Software assessment criteria are similar for any level of openness:
It can be easier to work in a coding language you already know and import the code into your existing software than to learn a new language. On the other hand, the use of existing packages and executables can accelerate your work.
Read the README file. Does the software meet your functional requirements? Are the environmental dependencies well-defined and reasonable?
It is a good sign if you can find evidence that the code has been used successfully by other users that have similar scientific or technical needs.
To quickly assess the community usage and quality of a software repository, use the tools provided by the platform where you found it. GitHub, for example, permits a quick scan of community interest, as evidenced by the number of times the repository has been copied, or ‘forked’ in GitHub parlance. You can also view the amount of recent activity in a community, such as commits, issues, and releases. GitHub also provides insights into the quality of the software.
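These indicators can also be read programmatically. As a rough sketch, the function below condenses the metadata that GitHub's REST API returns for a repository into a few health indicators. The function name `summarize_repo` is our own, and the example values are made up for illustration; they do not describe any real project.

```python
from datetime import datetime, timezone


def summarize_repo(repo, now=None):
    """Condense GitHub repository metadata into a few health indicators.

    `repo` is the parsed JSON from GET /repos/{owner}/{repo}.
    """
    now = now or datetime.now(timezone.utc)
    # GitHub timestamps look like "2024-01-15T12:00:00Z" (UTC).
    pushed = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    pushed = pushed.replace(tzinfo=timezone.utc)
    # The "license" field is null when GitHub cannot detect a license.
    license_info = repo.get("license") or {}
    return {
        "stars": repo["stargazers_count"],
        "forks": repo["forks_count"],
        "open_issues": repo["open_issues_count"],
        "days_since_last_push": (now - pushed).days,
        "archived": repo.get("archived", False),
        "license": license_info.get("spdx_id", "unknown"),
    }


# Made-up values for illustration only; not statistics for a real project.
example_repo = {
    "stargazers_count": 120,
    "forks_count": 45,
    "open_issues_count": 8,
    "pushed_at": "2024-01-15T12:00:00Z",
    "archived": False,
    "license": {"spdx_id": "BSD-3-Clause"},
}
summary = summarize_repo(
    example_repo, now=datetime(2024, 2, 14, 12, 0, tzinfo=timezone.utc)
)
print(summary)
```

No single number settles the question. A repository with few forks but a recent push and a clear license may still be a better choice than a popular one that has been archived.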
You have found some open code that will help you solve your scientific problem and it looks easy to use. However, you may still have some reservations. Perhaps you are unsure if the code poses a security risk, for example.
The risks are relatively low for small snippets of code that are easy for you to fully understand. However, you may not be able to fully understand all components of a large open software package.
Open software is sometimes perceived to carry more security risks. This is generally less of a problem for open source code than for executables, because the community can audit the source code for security vulnerabilities. How can you assess security in this case?
So, you want to reuse some open code you discovered. It is essential to check the legal restrictions and requirements imposed on users, which are generally provided in the license.
Although licensing is a nuanced subject that you will learn more about in Lesson 3, it is useful to be aware that there are generally two classes of license: permissive and non-permissive. Permissive licenses, most commonly Apache 2.0, MIT, or BSD, will generally allow you to use the code for your scientific research with little restriction, whereas non-permissive licenses, such as copyleft licenses, impose substantial restrictions on how you use the code and require more careful consideration.
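A first-pass license check can be sketched as a simple lookup on the license's SPDX identifier. The two sets below are deliberately short and are our own illustrative selection, not a complete or authoritative list; anything outside them is flagged for a manual read of the license text.

```python
# Illustrative, incomplete selections of SPDX license identifiers.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}
COPYLEFT = {
    "GPL-2.0-only",
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "AGPL-3.0-only",
    "LGPL-3.0-only",
}


def classify_license(spdx_id):
    """Return a coarse license class for a given SPDX identifier."""
    if spdx_id in PERMISSIVE:
        return "permissive"
    if spdx_id in COPYLEFT:
        return "copyleft"
    # Unknown or custom licenses need a careful read before reuse.
    return "review manually"


for spdx_id in ("MIT", "GPL-3.0-only", "Custom-Lab-License"):
    print(spdx_id, "->", classify_license(spdx_id))
```

The identifier `Custom-Lab-License` is a hypothetical name used only to show the fallback branch.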
Software can be reused in a variety of ways. A software package can be executed on its own to provide a complete analysis or model, depending on the input parameters. Alternatively, the package could be imported as part of a larger library to provide specific functionality. Also, code snippets can be copied into existing code, if permitted, or the code could be rewritten and incorporated into new software.
If you simply intend to reuse a code snippet, continuously test that your selected code works as you expect. If you are reusing a more complex code, there are additional considerations.
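One lightweight way to test a reused snippet is to wrap it in a function and compare its output with answers you have worked out by hand. In the sketch below, `moving_average` is a stand-in for a snippet you might have copied, not code from any particular source, and the expected values were computed manually.

```python
def moving_average(values, window):
    """Stand-in for a copied snippet: mean of each consecutive window."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def check_moving_average():
    """Compare the snippet against hand-computed answers and edge cases."""
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
    assert moving_average([5, 5, 5], 3) == [5.0]
    # The snippet should reject a window longer than the data.
    try:
        moving_average([1, 2], 3)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an oversized window")


check_moving_average()
print("snippet behaves as expected")
```

Rerunning checks like these whenever you update the snippet or its surrounding code is what makes the testing continuous.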
Consider the following when selecting among multiple versions of open source software.
Use the latest stable release when possible | Just like software updates to your phone or computer’s operating system or apps, it is important to use the latest stable release. Developers often release developmental versions that include new features or bug fixes that are not fully tested. For this reason, using a developmental release is generally not recommended. |
Determine the origin of the version you intend to use | Determine whether the version you intend to use comes from a modified open-source project or from its original source project. With this information, determine which source is more appropriate for your project. |
Check for issues and bugs | Check for any known issues or bugs with your selected version that could cause problems. Find current information on issues or bugs by checking release notes, issue trackers, and developer forums. |
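Choosing the latest stable release can be sketched as a filter over a project's release tags. This example assumes tags follow a plain MAJOR.MINOR.PATCH pattern, optionally prefixed with "v", and treats anything with a suffix (such as `rc1` or `.dev0`) as a developmental release. Real projects may use other schemes, so the pattern and the function name `latest_stable` are assumptions made for illustration.

```python
import re

# Assumed tag format: optional "v", then exactly MAJOR.MINOR.PATCH.
STABLE_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def latest_stable(tags):
    """Return the highest stable tag, or None if no tag looks stable."""
    stable = []
    for tag in tags:
        match = STABLE_TAG.match(tag)
        if match:
            # Compare versions numerically so that 1.10.0 ranks above 1.9.0.
            numbers = tuple(int(part) for part in match.groups())
            stable.append((numbers, tag))
    if not stable:
        return None
    return max(stable)[1]


tags = ["1.9.0", "1.10.0", "2.0.0rc1", "2.0.0.dev0", "v1.2.3"]
print(latest_stable(tags))
```

Comparing the numeric parts rather than the raw strings matters: sorted as text, "1.9.0" would incorrectly rank above "1.10.0".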
In this activity, you are asked to select from a list of ways you can resolve some common problems that arise when using open software.
Select how you can resolve this problem when using open software: Difficulty finding open software that meets your needs.
Select all that apply.
Select how you can resolve the problem when using open software: installation difficulties.
Select all that apply.
Select how you can resolve the problem when using open software: software is not working as expected.
Select all that apply.
After answering the questions above, work through some specific examples of how you would resolve problems on your own. For example, navigate to the astropy code repository on GitHub or another repository of your choice, and find the README and LICENSE files. Determine how you would contact the developers for help, etc.
Imagine that you’ve used open code pulled from the web and it made a big difference for your research paper. How should you provide due credit for the open code that contributed to your research?
Example: You managed to implement GOTM to learn something new about ocean turbulence in the Southern Ocean, or you managed to compute a Lomb-Scargle periodogram using astropy. Here are some questions to consider:
Cite any code that you view as having contributed to your research:
In most cases, a code snippet on Stack Overflow does not constitute a citable research contribution. However, an author can still decide to cite it if they choose.
Instances when shared code directly impacts the scientific results and requires a detailed description include:
Check whether the journal where you are publishing has specific instructions on how to cite software (e.g., AAS Software Citation Suggestions).
In some cases, a software’s licensing terms and conditions require acknowledgement or citation in the references or bibliography of any publications based on research that made use of the software.
Ideally, use and cite code that is archived in a long-term repository with a persistent DOI. Follow the guidance about the preferred citation format, which is provided in the long-term repository and may appear in a README or a CITATION file.
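Once you have a local copy of a repository, locating these guidance files can be sketched in a few lines. The helper below looks only at the top level of a directory for file names that start with "citation", "readme", or "license". Those prefixes reflect common naming conventions (for example `CITATION.cff`), and the function name `find_guidance_files` is ours; a project may store this information elsewhere.

```python
import tempfile
from pathlib import Path

# Common file-name prefixes; an assumption about naming conventions.
GUIDANCE_PREFIXES = ("citation", "readme", "license")


def find_guidance_files(repo_dir):
    """Map each guidance prefix to matching top-level files in repo_dir."""
    found = {}
    for path in sorted(Path(repo_dir).iterdir()):
        if not path.is_file():
            continue
        name = path.name.lower()
        for prefix in GUIDANCE_PREFIXES:
            if name.startswith(prefix):
                found.setdefault(prefix, []).append(path.name)
    return found


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("README.md", "LICENSE", "CITATION.cff", "setup.py"):
        (Path(tmp) / file_name).write_text("")
    demo = find_guidance_files(tmp)

print(demo)
```

A missing "citation" entry does not mean the software cannot be cited; in that case, check the README or the project's documentation for a preferred reference.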
DOIs provide a persistent identifier/link for research outputs. Thus, it is preferable to cite code in long-term repositories linked to a DOI. URLs (e.g., Stack Overflow) and active repositories (e.g., on GitHub) are mutable but can be used if there is no alternative.
Packages may provide a way to cite individual versions as well. For reproducibility, cite both the overall package and the version that is used in your work. As functionality of a package may evolve with the release of new versions, this helps provide a specific description of your work.
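For Python packages, the installed version can be looked up with the standard library and written into a reference string. This is only a sketch of the idea: the output format is an assumption rather than a journal style, and `ExamplePackage` and the DOI `10.0000/example` are placeholders that do not refer to real software.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def software_citation(name, package_version, doi=None):
    """Format a short software reference that pins the version used."""
    text = f"{name}, version {package_version}"
    if doi:
        # DOIs resolve through the doi.org proxy.
        text += f", https://doi.org/{doi}"
    return text


# Placeholder name, version, and DOI for illustration only.
print(software_citation("ExamplePackage", "1.2.3", doi="10.0000/example"))
```

In practice you would pass the result of `installed_version("astropy")`, or of whichever package you used, as the version, and take the DOI from the package's own citation guidance.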
If you are writing software, you can also cite the software that you have used in the comments and documentation of your own code.
In this lesson, you learned that:
Answer the following questions to test what you have learned so far.
Question
01/03
Discovering open software successfully depends on which of the following:
Select all that apply.
Question
02/03
Read the statement and decide whether it’s true or false:
It is best to reach out to the developers of open access software via private communication if you run into problems.
Question
03/03
When citing Open Code, it is best practice to cite: