feat(portal): updated portal loader and added demo dataset (#4771)
* feat: added demo dataset for the portal model
davidruvolo51 authored Mar 5, 2025
1 parent abc9ea4 commit 5c8e735
Showing 18 changed files with 604 additions and 29 deletions.
@@ -34,6 +34,12 @@ public void run() {
      List<Row> rows = getProfilesFromAllModels("/portal", List.of());
      getSchema().migrate(Emx2.fromRowList(rows));
      MolgenisIO.fromClasspathDirectory("/_ontologies", getSchema(), false);
      MolgenisIO.fromClasspathDirectory("/_settings/portal", getSchema(), false);

      if (isIncludeDemoData()) {
        MolgenisIO.fromClasspathDirectory("/_demodata/applications/portal", getSchema(), false);
      }

      this.complete();
    } catch (Exception e) {
      this.completeWithError(e.getMessage());
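
The hunk reads the ontologies and the portal settings unconditionally, then reads the demo data only when `isIncludeDemoData()` returns true. A minimal sketch of that ordering, assuming a hypothetical helper class `PortalLoadPlan` that is not part of the repository and only models which classpath directories are visited:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it does not call MolgenisIO.
// It lists the classpath directories in the order the hunk reads them.
final class PortalLoadPlan {
  private PortalLoadPlan() {}

  static List<String> directories(boolean includeDemoData) {
    List<String> dirs = new ArrayList<>();
    dirs.add("/_ontologies");
    dirs.add("/_settings/portal");
    if (includeDemoData) {
      // Demo data comes last, after ontologies and settings.
      dirs.add("/_demodata/applications/portal");
    }
    return List.copyOf(dirs);
  }

  public static void main(String[] args) {
    System.out.println(directories(true));
    System.out.println(directories(false));
  }
}
```

Returning the plan as a list keeps the flag's effect easy to assert on without a database or schema.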
19 changes: 19 additions & 0 deletions data/_demodata/applications/portal/Biosamples.csv
@@ -0,0 +1,19 @@
id,collected from individual,included in datasets.resource,included in datasets.name
SAMEA10450711,Case1F,EGAS00001005702,EGAD00001008392
SAMEA10450712,Case1C,EGAS00001005702,EGAD00001008392
SAMEA10450713,Case1M,EGAS00001005702,EGAD00001008392
SAMEA10450714,Case2C,EGAS00001005702,EGAD00001008392
SAMEA10450715,Case2F,EGAS00001005702,EGAD00001008392
SAMEA10450716,Case2M,EGAS00001005702,EGAD00001008392
SAMEA10450723,Case3C,EGAS00001005702,EGAD00001008392
SAMEA10450724,Case3F,EGAS00001005702,EGAD00001008392
SAMEA10450725,Case3M,EGAS00001005702,EGAD00001008392
SAMEA10450726,Case4C,EGAS00001005702,EGAD00001008392
SAMEA10450728,Case4F,EGAS00001005702,EGAD00001008392
SAMEA10450729,Case4M,EGAS00001005702,EGAD00001008392
SAMEA10450730,Case5C,EGAS00001005702,EGAD00001008392
SAMEA10450731,Case5M,EGAS00001005702,EGAD00001008392
SAMEA10450732,Case5F,EGAS00001005702,EGAD00001008392
SAMEA10450733,Case6C,EGAS00001005702,EGAD00001008392
SAMEA10450734,Case6F,EGAS00001005702,EGAD00001008392
SAMEA10450735,Case6M,EGAS00001005702,EGAD00001008392
9 changes: 9 additions & 0 deletions data/_demodata/applications/portal/Clinical observations.csv
@@ -0,0 +1,9 @@
id,individual,date saved in database
7qOBAUD680,Case1C,2025-02-25
7qOBAUD681,Case2C,2025-02-25
7qOBAUD682,Case2M,2025-02-25
7qOBAUD683,Case3C,2025-02-25
7qOBAUD684,Case4C,2025-02-25
7qOBAUD685,Case5C,2025-02-25
7qOBAUD686,Case5M,2025-02-25
7qOBAUD687,Case6C,2025-02-25
2 changes: 2 additions & 0 deletions data/_demodata/applications/portal/Datasets.csv
@@ -0,0 +1,2 @@
resource,name,label,since version,date created,dataset type,description,unit of observation,keywords
EGAS00001005702,EGAD00001008392,Rare Disease Synthetic Dataset,2001-01-01,2025-02-27,Collected dataset,"The purpose of this project is to provide public human datasets for the study of rare diseases. The use of public human genomic background combined with the in-silico insertion of real disease-causing variants enables a representative dataset for testing purposes without facing ethical and legal issues associated with the use of human sensitive data. This project aims to help development of technical implementations for rare disease data integration, analysis, discovery, and federated access.",sample,"Sex/gender,Family and household structure,Health-related characteristics,Diseases"
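
Other tables point at this row through the `included in datasets.resource` and `included in datasets.name` columns, so the key cells presumably need to match character for character. A minimal sketch of a padding check, assuming a hypothetical class `KeyWhitespaceCheck` and cells that have already been split by a CSV parser; the padded dataset name in `main` is illustrative input:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check for illustration; not part of the repository.
final class KeyWhitespaceCheck {
  private KeyWhitespaceCheck() {}

  // Returns the indexes of cells that carry leading or trailing whitespace.
  static List<Integer> paddedCells(List<String> cells) {
    List<Integer> padded = new ArrayList<>();
    for (int i = 0; i < cells.size(); i++) {
      String cell = cells.get(i);
      if (!cell.equals(cell.strip())) {
        padded.add(i);
      }
    }
    return padded;
  }

  public static void main(String[] args) {
    // Illustrative row: the second cell has a trailing space.
    List<String> row = List.of("EGAS00001005702", "EGAD00001008392 ", "sample");
    System.out.println(paddedCells(row)); // prints [1]
  }
}
```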
6 changes: 6 additions & 0 deletions data/_demodata/applications/portal/Disease history.csv
@@ -0,0 +1,6 @@
part of clinical observation,disease
7qOBAUD680,Central core disease
7qOBAUD684,Mitochondrial DNA depletion syndrome
7qOBAUD687,Central core disease
7qOBAUD685,Hereditary breast cancer
7qOBAUD686,Hereditary breast cancer
428 changes: 428 additions & 0 deletions data/_demodata/applications/portal/Files.csv

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions data/_demodata/applications/portal/Gender at birth.csv
@@ -0,0 +1,14 @@
name,definition,codesystem,code,ontologyTermURI
assigned U at birth,An assignment of an 'U' gender shortly after birth. Note that 'U' is only available in certain jurisdictions on birth certificates.,GSSO,9511,http://purl.obolibrary.org/obo/GSSO_009511
assigned X at birth,An assignment of an 'X' gender shortly after birth. Note that 'X' is only available in certain jurisdictions on birth certificates.,GSSO,9486,http://purl.obolibrary.org/obo/GSSO_009486
assigned diverse at birth,"An assignment of a diverse gender marker shortly after birth. Note that 'diverse' is only available in certain jurisdictions on birth certificates. Further, note that 'diverse' itself is not a gender, but is a legal gender in this context.",GSSO,9512,http://purl.obolibrary.org/obo/GSSO_009512
assigned eunuch at birth,"An assignment of a 'eunuch' gender marker shortly after birth. Note that 'eunuch' is only available in certain jurisdictions on birth certificates. Further, note that 'eunuch' itself is not a gender, but is a legal gender in this context. This term may also be offensive depending on the context.",GSSO,9513,http://purl.obolibrary.org/obo/GSSO_009513
assigned female at birth,An assignment of a female gender shortly after birth.,GSSO,123,http://purl.obolibrary.org/obo/GSSO_000123
assigned indeterminate at birth,"An assignment of an indeterminate gender marker shortly after birth. Note that 'indeterminate' is only available in certain jurisdictions on birth certificates. Further, note that 'indeterminate' itself is not a gender, but is a legal gender in this context.",GSSO,9509,http://purl.obolibrary.org/obo/GSSO_009509
assigned intersex at birth,"An assignment of an intersex gender marker shortly after birth. Note that 'intersex' is only available in certain jurisdictions on birth certificates. Further, note that 'intersex' itself is not a gender, but is a legal gender in this context.",GSSO,9488,http://purl.obolibrary.org/obo/GSSO_009488
assigned male at birth,An assignment of a male gender shortly after birth.,GSSO,124,http://purl.obolibrary.org/obo/GSSO_000124
assigned no gender at birth,"Use in situations where no gender was assigned at birth, i.e. no birth certificate was issued shortly after birth or no birth certificate gender was recorded.",GSSO,9487,http://purl.obolibrary.org/obo/GSSO_009487
assigned nonbinary at birth,An assignment of a nonbinary gender shortly after birth. Note that 'nonbinary' is only available in certain jurisdictions on birth certificates.,GSSO,9510,http://purl.obolibrary.org/obo/GSSO_009510
assigned other at birth,An assignment of an 'other' gender marker shortly after birth. Note that 'other' is only available in certain jurisdictions on birth certificates.,GSSO,9507,http://purl.obolibrary.org/obo/GSSO_009507
assigned third gender at birth,"An assignment of a 'third gender' gender marker shortly after birth. Note that 'third gender' is only available in certain jurisdictions on birth certificates. Further, note that 'third gender' itself is not a gender, but is a legal gender in this context. This term may also be offensive depending on the context.",GSSO,9514,http://purl.obolibrary.org/obo/GSSO_009514
assigned unspecified at birth,An assignment of an 'unspecified' gender shortly after birth. Note that 'unspecified' is only available in certain jurisdictions on birth certificates.,GSSO,9515,http://purl.obolibrary.org/obo/GSSO_009515
19 changes: 19 additions & 0 deletions data/_demodata/applications/portal/Individuals.csv
@@ -0,0 +1,19 @@
id,gender at birth,pedigree,included in datasets.resource,included in datasets.name
Case1F,assigned male at birth,Case1,EGAS00001005702,EGAD00001008392
Case1C,assigned male at birth,Case1,EGAS00001005702,EGAD00001008392
Case1M,assigned female at birth,Case1,EGAS00001005702,EGAD00001008392
Case2C,assigned male at birth,Case2,EGAS00001005702,EGAD00001008392
Case2F,assigned male at birth,Case2,EGAS00001005702,EGAD00001008392
Case2M,assigned female at birth,Case2,EGAS00001005702,EGAD00001008392
Case3C,assigned male at birth,Case3,EGAS00001005702,EGAD00001008392
Case3F,assigned male at birth,Case3,EGAS00001005702,EGAD00001008392
Case3M,assigned female at birth,Case3,EGAS00001005702,EGAD00001008392
Case4C,assigned male at birth,Case4,EGAS00001005702,EGAD00001008392
Case4F,assigned male at birth,Case4,EGAS00001005702,EGAD00001008392
Case4M,assigned female at birth,Case4,EGAS00001005702,EGAD00001008392
Case5C,assigned female at birth,Case5,EGAS00001005702,EGAD00001008392
Case5M,assigned female at birth,Case5,EGAS00001005702,EGAD00001008392
Case5F,assigned male at birth,Case5,EGAS00001005702,EGAD00001008392
Case6C,assigned male at birth,Case6,EGAS00001005702,EGAD00001008392
Case6F,assigned male at birth,Case6,EGAS00001005702,EGAD00001008392
Case6M,assigned female at birth,Case6,EGAS00001005702,EGAD00001008392
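
The `id` column here is what `collected from individual` in Biosamples.csv and `individual` in Clinical observations.csv refer to. A minimal sketch of a reference check, assuming a hypothetical class `ReferenceCheck` and values already read from the CSV files; `Case9X` is a made-up id used only to show a miss:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical check for illustration; not part of the repository.
final class ReferenceCheck {
  private ReferenceCheck() {}

  // Returns every reference that has no matching id, in input order.
  static List<String> unresolved(Set<String> knownIds, List<String> references) {
    List<String> missing = new ArrayList<>();
    for (String ref : references) {
      if (!knownIds.contains(ref)) {
        missing.add(ref);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Set<String> individuals = Set.of("Case1F", "Case1C", "Case1M");
    // "Case9X" is an invented id that does not exist in Individuals.csv.
    List<String> references = List.of("Case1F", "Case1C", "Case9X");
    System.out.println(unresolved(individuals, references)); // prints [Case9X]
  }
}
```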
19 changes: 19 additions & 0 deletions data/_demodata/applications/portal/Pedigree members.csv
@@ -0,0 +1,19 @@
pedigree,individual,relative,relation
Case1,Case1C,Case1C,Patient
Case1,Case1F,Case1C,Biological Father
Case1,Case1M,Case1C,Biological Mother
Case2,Case2C,Case2C,Patient
Case2,Case2F,Case2C,Biological Father
Case2,Case2M,Case2C,Biological Mother
Case3,Case3C,Case3C,Patient
Case3,Case3F,Case3C,Biological Father
Case3,Case3M,Case3C,Biological Mother
Case4,Case4C,Case4C,Patient
Case4,Case4F,Case4C,Biological Father
Case4,Case4M,Case4C,Biological Mother
Case5,Case5C,Case5C,Patient
Case5,Case5F,Case5C,Biological Father
Case5,Case5M,Case5C,Biological Mother
Case6,Case6C,Case6C,Patient
Case6,Case6F,Case6C,Biological Father
Case6,Case6M,Case6C,Biological Mother
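
Each pedigree in this file is a trio: one `Patient`, one `Biological Father`, and one `Biological Mother`. A minimal sketch of a completeness check, assuming a hypothetical class `TrioCheck` and rows given as `{pedigree, individual, relative, relation}` arrays; the incomplete `Case2` rows in `main` are deliberately shortened for illustration:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical check for illustration; not part of the repository.
final class TrioCheck {
  private TrioCheck() {}

  static final Set<String> TRIO =
      Set.of("Patient", "Biological Father", "Biological Mother");

  // Each row is {pedigree, individual, relative, relation}.
  // Returns the pedigrees that lack at least one of the three relations.
  static List<String> incompletePedigrees(List<String[]> rows) {
    Map<String, Set<String>> relationsByPedigree = new LinkedHashMap<>();
    for (String[] row : rows) {
      relationsByPedigree.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[3]);
    }
    List<String> incomplete = new ArrayList<>();
    for (Map.Entry<String, Set<String>> entry : relationsByPedigree.entrySet()) {
      if (!entry.getValue().containsAll(TRIO)) {
        incomplete.add(entry.getKey());
      }
    }
    return incomplete;
  }

  public static void main(String[] args) {
    List<String[]> rows = new ArrayList<>();
    rows.add(new String[] {"Case1", "Case1C", "Case1C", "Patient"});
    rows.add(new String[] {"Case1", "Case1F", "Case1C", "Biological Father"});
    rows.add(new String[] {"Case1", "Case1M", "Case1C", "Biological Mother"});
    // Case2 is shortened on purpose: the mother row is left out.
    rows.add(new String[] {"Case2", "Case2C", "Case2C", "Patient"});
    rows.add(new String[] {"Case2", "Case2F", "Case2C", "Biological Father"});
    System.out.println(incompletePedigrees(rows)); // prints [Case2]
  }
}
```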
7 changes: 7 additions & 0 deletions data/_demodata/applications/portal/Pedigree.csv
@@ -0,0 +1,7 @@
identifier
Case1
Case2
Case3
Case4
Case5
Case6
4 changes: 4 additions & 0 deletions data/_demodata/applications/portal/Phenotype observations.csv
@@ -0,0 +1,4 @@
part of clinical observation,type
7qOBAUD681,Macular dystrophy
7qOBAUD682,Macular dystrophy
7qOBAUD683,Limb-girdle muscular dystrophy
2 changes: 2 additions & 0 deletions data/_demodata/applications/portal/Resources.csv
@@ -0,0 +1,2 @@
id,name,website,type,description,cohort type,clinical study type,start year,status,number of participants,number of participants with samples,release type
EGAS00001005702,Human genomic and phenotypic synthetic data for the study of rare diseases,https://ega-archive.org/studies/EGAS00001005702,Rare disease,"The purpose of this dataset is to facilitate development of technical implementations for rare disease data integration, analysis, discovery, and federated access. This synthetic dataset includes clinical and genomic data from 6 rare disease cases. It consists of 18 whole genomes (6 index cases with their parents) which have genetic background based on public human data sequenced in the context of the Illumina Platinum initiative (Eberle, MA et al. (2017)) and made available by the HapMap project (https://www.genome.gov/10001688/international-hapmap-project). In each of the cases, real causative variants correlating with the phenotypic data provided were spiked-in. The cases included in this synthetic dataset correspond to the following type of disorders: CASE 1- Congenital myasthenic syndrome (Autosomal Dominant -de novo variant), CASE 2- Macular dystrophy (Autosomal Dominant), CASE 3- Muscular dystrophy (Autosomal Recessive-compound heterozygous variants), CASE 4- Mitochondrial disorder (Autosomal Recessive-consanguineous case - homozygous variant), CASE 5- Breast cancer (Autosomal Dominant), CASE 6- Similar as case 1 for patient matchmaking tests: Congenital myasthenic syndrome (Autosomal Dominant-de novo variant) For each case you will be able to download the following data: clinical information (phenopackets per individual and pedigree per family), raw genomic data (FASTQ and BAMs) and processed genomic data (vcfs). When using the data, the following should be acknowledged: the RD-Connect GPAP (https://platform.rd-connect.eu/), EC H2020 project EJP-RD (grant # 825575), EC H2020 project B1MG (grant # 951724) and Generalitat de Catalunya VEIS project (grant # 001-P-001647).",Other type,Primary data collection,2021,Finalised,6,6,Closed dataset