By the end of this lesson, you will be familiar with resources for open results utilization, how and when to cite the sources of the open results that you use, how to provide feedback to open results providers, and how to determine when it is appropriate to invite authors of the open results materials to be formal collaborators versus simply citing those resources in your work.
Published articles, blog posts, and forums can lead to new ideas for your own research. A technique learned from social media can be applied to a use-case that you are trying to solve. There are many different ways to discover results.
After completing this lesson, you should be able to:
How do you learn about the state of research in a particular field? How do you engage in the current conversation? Researchers often begin with a search of peer-reviewed articles. This review tells you how much research has been done in a field and what conclusions have recently been reached. In most fields, going through the peer-review process can take up to a year. Preprints can help reduce this delay because they offer the latest findings before a publication date. Researchers who choose to share their results before publication typically do so in the ways listed as best practices above. As you start research on a topic, how do you find all these different types of results and engage with the most relevant research?
The various stages of research, from conceptualization to dissemination of results, produce products that can be put into the public domain as “Open Results”. Where these results are archived, and to what degree, depends on the discipline and the author. However, some general guidelines on where to start a search for open results include:
Scholarly Search Portals
Search engines like Google and Bing have radically changed how we look up information. For research results, specialized academic search engines and portals curate scientific results from researchers based on topic and field. These engines are useful for finding peer-reviewed articles.
Portals can be generic or discipline-specific.
Publications that provide some levels of open access are tracked in the Directory of Open Access Journals (DOAJ).
Web Searches
Open results include much more than open-access peer-reviewed publications. How do you find these alternative types of research objects?
Open communities and forums offer the best way to find research objects other than complete publications. How do you even find out whether these exist and where they are?
Once you have found a few peer-reviewed articles that are highly relevant, to find additional research objects, you can follow the authors on social media for links to their posts, blogs, and activities. There are open communities in almost every area of research - find yours! Here are different platforms to locate these conversations and resources:
Various research objects, including datasets and software, are frequently attached to scholarly publications in the form of supplemental material. At other times, the paper references the source, which could be a GitHub repository, a personal or institutional website, or another storage site. Engaging in discussions on the GitHub repository can be another starting point.
Kerchunk Example: In Lesson 1, a blog post about the software library ‘kerchunk’ was presented. Let’s look at a post about Kerchunk on the Pangeo Discourse Forum that has a large number of views. The open science Pangeo project works completely in the open. The project website (run off of GitHub) has links to blog posts, a discussion forum, and a calendar of all their meetings, which anyone is welcome to join. This has resulted in an engaged and dynamic community. An example of this comes from the post linked to above, where one person asks for help, others reply, and the conversation is documented in the open. The post’s 636 views indicate that this question, or a similar one, has occurred to others. Imagine if this had been done over private email. By working in the open, they are improving science and helping everyone become faster and more accurate.
“Garbage in, garbage out” – your own research products are only as good as the data used in your investigation.
If you use poor-quality data or materials from unreliable and unvetted sources as critical components of your research, you run the risk of producing flawed or low-quality science that may harm your reputation as a scientist. Therefore, it is critical to assess the quality and reliability of open-results sources before you include them in your own work.
What are best practices for assessing the quality of sources other than research articles, such as blog posts, YouTube videos, and other research objects?
Let’s take a look at the questions you might consider asking yourself when determining the reliability of any type of open results source.
Here, we list questions about the open results materials themselves and about the server they are downloaded from. The more of these questions that can be answered in the affirmative, the lower the risk in using the open results materials for your own research.
The questions are grouped under these headings: The Material Itself; The Associated Website / Server; Source Reliability Indicators.
Adapted from https://www.scribbr.com/working-with-sources/credible-sources/
Note that failing to meet one or more of the criteria does not automatically mean that the open results are of poor quality. It means that you should exercise more caution before incorporating them into your own research, and that you will have to invest more effort in personally vetting the material to ensure its quality is sufficient for your purposes.
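The idea that more affirmative answers mean lower risk can be sketched as a simple tally. This is only an illustration: the question wording below is hypothetical and is not the lesson's checklist, and the fraction it reports is a prompt for further vetting, not a quality score.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChecklistAnswer:
    question: str      # hypothetical wording, not the lesson's official list
    affirmative: bool  # True if the answer to the question is "yes"


def affirmative_fraction(answers: List[ChecklistAnswer]) -> float:
    """Return the share of checklist questions answered 'yes' (0.0 to 1.0)."""
    if not answers:
        raise ValueError("at least one answer is required")
    return sum(a.affirmative for a in answers) / len(answers)


def unmet_questions(answers: List[ChecklistAnswer]) -> List[str]:
    """List the questions answered 'no'; these need personal vetting."""
    return [a.question for a in answers if not a.affirmative]


# Hypothetical answers for one open result.
answers = [
    ChecklistAnswer("Is the author identified?", True),
    ChecklistAnswer("Is the material dated?", True),
    ChecklistAnswer("Is the hosting site maintained by a known institution?", False),
    ChecklistAnswer("Can the result be reproduced from shared code or data?", True),
]

print(f"Affirmative answers: {affirmative_fraction(answers):.0%}")
for question in unmet_questions(answers):
    print("Needs extra vetting:", question)
```

A low fraction does not disqualify a source. The list of unmet questions is the useful part, because it tells you where to spend your own vetting effort.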
Reliable Example: Qiusheng Wu YouTube videos (as mentioned in the previous lesson). Professor Wu is an expert in his field. He presents results along with notebooks that demonstrate reproducibility. Comments on his YouTube tutorial videos represent meaningful interactions between users reproducing results and the author.
While open results benefit science and have already provided valuable societal benefits, the misuse and incautious sharing of open materials can have far-reaching harmful effects. The end-user of open results bears the responsibility to ensure that the data they reference are used in a responsible manner and that any relevant guidelines for the use of the data are followed.
Contributing to open resources and providing constructive feedback are vital to a healthy open access ecosystem. Continual improvements and expanded capabilities help ensure the long-term sustainability of those resources.
In our current system, there are results creators and consumers. This scenario presents a one-way street with no feedback loop, no sharing of data back to publishers, and no sharing between intermediaries.
The practice of producing open results aims to foster a system where feedback loops exist between users and makers. Users share their cleaned, integrated, or improved work with the maker. This feedback creates a symbiotic and sustainable process where everyone benefits.
Pro: The feedback is open and other community members can see ongoing issues that are being addressed.
Pro: Contribution is archived and logged on GitHub.
Working with GitHub Issues
See this blog for general issue etiquette
Con: The feedback is closed. The information is generally not propagated back to the community unless the creator creates a new version.
Con: No way of tracking credit.
If your feedback results in a substantial intellectual contribution to the work, it is reasonable for you to expect an opportunity for co-authorship in a future version of the open result. The associated contribution guidelines should address this possibility and manage expectations prior to your providing feedback.
Unfortunately, contributor guidelines often do not exist, and it is not clear what counts as “substantial”.
Additionally, give credit to repositories that provide open source materials in the acknowledgments section of your paper. If the repository provides an acknowledgments template in its “About” link, follow that suggestion. Otherwise, a generic “This research has made use of <insert repository name>.” will be sufficient.
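The acknowledgment rule above (use the repository's template if it has one, otherwise fall back to the generic sentence) can be sketched as a small helper. The function name and the `{repository}` placeholder are assumptions made for this illustration; real repository templates are free text and will not follow this convention.

```python
from typing import Optional


def acknowledgment(repository_name: str, template: Optional[str] = None) -> str:
    """Build an acknowledgment sentence for a repository.

    `template` stands in for wording supplied by the repository itself.
    The "{repository}" placeholder is a convention invented for this sketch.
    """
    name = repository_name.strip()
    if not name:
        raise ValueError("repository name must not be empty")
    if template:
        # Prefer the repository's own suggested wording when it exists.
        return template.format(repository=name)
    # Generic fallback sentence quoted in the lesson.
    return f"This research has made use of {name}."


print(acknowledgment("Zenodo"))
print(acknowledgment("Zenodo", "We thank {repository} for hosting the data."))
```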
Standard guidelines that you’ve been using in your research all along for providing appropriate attribution and citations of closed access publications also apply to open access published works.
Examples of plagiarism include:
Here is a useful guide regarding the different forms of plagiarism
Giving proper attribution to open results is an important ethical responsibility when using open source materials. The process for citation is specific to the nature of the material.
If a paper has been formally published in a journal, then your citation should point to the published version rather than to a preprint server.
Take the time to locate the originating journal to provide an accurate citation.
Preprint Server (Cite only if journal publication not available)
Source Publication (Always cite)
If a paper that you wish to cite is not yet accepted for publication, you should follow the guidelines of the journal to which you are submitting your paper. A preprint reference citation typically includes author name(s), date of the most recent version posted, paper title, name of the preprint server, object type (“preprint”), and the DOI.
At the time this lesson was prepared, the following paper had not yet appeared as a journal publication.
Jin, H., et al. 2023, “Optical color of Type Ib and Ic supernovae and implications for their progenitors,” ApJ, preprint, arXiv:2304.10670.
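As a rough sketch, the elements of a preprint citation listed above can be collected in a small record and joined into one string. The field names, the punctuation, and the example values (including the server name and DOI) are hypothetical; the journal you are submitting to decides the actual format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Preprint:
    authors: List[str]  # author name(s)
    posted: str         # date of the most recent version posted
    title: str          # paper title
    server: str         # name of the preprint server
    doi: str            # DOI without the "https://doi.org/" prefix


def format_preprint(p: Preprint) -> str:
    """Join the citation elements in one illustrative order.

    The layout is an assumption for this sketch, not a journal style.
    """
    authors = "; ".join(p.authors)
    return (
        f'{authors} ({p.posted}). "{p.title}." '
        f"{p.server}, preprint. https://doi.org/{p.doi}"
    )


# Hypothetical preprint; none of these values refer to a real paper.
example = Preprint(
    authors=["Doe, J.", "Roe, R."],
    posted="2023-01-15",
    title="A Hypothetical Title",
    server="ExampleRxiv",
    doi="10.0000/example.12345",
)
print(format_preprint(example))
```

Keeping the elements in separate fields makes it easier to reformat the same reference when a different journal asks for a different order.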
For Material That Has a DOI
Follow existing citation guidelines and community best practices.
For Material That Does Not Have a DOI
Examples include blog posts, videos, and notebooks.
For Other Materials or Interactions That Were Helpful for Your Research
In the Lesson 1 blog post example, the author acknowledged the people who helped them, an article they found helpful, two different communities, and the computational environment they worked in. This is a great example of giving credit: “I would like to thank Rich Signell (USGS) and Martin Durant (Anaconda) for their help in learning this process. If you’re interested in seeing more detail on how this works, I recommend Rich’s article from 2020 on the topic. I would also like to recognize Pangeo and Pangeo-forge who work hard to make working with big data in geoscience as easy as possible. Work on this project was done on the Pangeo AWS deployment.”
In Lesson 1, the JWST case study was presented. The peer-reviewed publication that reported the first discovery of CO2 on another planet has been accessed 18,000+ times. Notice that the authorship is attributed to the entire team. The Acknowledgements section duly explains the contributions of their collaborators and partners: “The results reported herein benefited during the design phase from collaborations and/or information exchange within NASA’s Nexus for Exoplanet System Science (NExSS) research coordination network sponsored by NASA’s Science Mission Directorate.” Also, “All the data and models presented in this publication can be found at https://doi.org/10.5281/zenodo.6959427”. And finally, they cite all the software! “The codes used in this publication to extract, reduce and analyze the data are as follows..”
In this lesson, you learned:
Answer the following questions to test what you have learned so far.
Question 1 of 2: Which of the following could be a source of open results? Select all that apply.
Question 2 of 2: Which of the following characteristics suggest that a particular paper or data set is more likely to be a credible Open Result? Select all that apply.