Wikidata to build 5-star Linked Open biological databases: A case study of PanglaoDB
Research compendium for the project "Wikidata to build 5-star Linked Open biological databases: A case study of PanglaoDB".
Repository brief descrition
Research-related directories:
-
analysis: Scripts and notebooks used for the main analysis. It also includes the subdirectories:
- data: Raw data from PanglaoDB and Wikidata is stored here.
- results: Processed data is stored here.
-
manuscripts: Manuscripts for this research project, each manuscript is a submodule of a GitHub repository that uses Manubot.
-
improvements: One-use code, creating Wikidata items from PanglaoDB's metadata and improving existing items. Subdiretories:
- go2cell: Prototype of a shiny app to match Gene Ontology terms to cell types. As of April, 2024, it's running here.
Software-related directories, they are structured similarly to a Python package:
- wikidata_panglaodb: This is the source code for all author-defined functions used in the analysis.
- tests: These are the unit tests for the wikidata_panglaodb "package" functions.
- docs: This is a directory containing documentation for the wikidata_panglaodb functions, it is served as a live website in our github pages branch.
Reproducing and developing
Reproducing the analyses
Pre-requisites:
- Python>=3.7
- A unix based terminal interface.
Download the repository's zip file or clone it using:
git clone --recurse-submodules https://github.com/jvfe/wikidata_panglaodb
Then, at the project's root directory (wikidata_panglaodb/):
make repro
This will reproduce all steps of the analysis done after the reconciliation.
Collaborating
Pre-requisites:
- Git
- A unix based terminal interface
- Conda
Initiate the environment:
make develop
If you've already collaborated before but changes have been made to the conda enviroment/repository, run:
make update-proj