Feedback
You are welcome to submit feedback to add, augment, or refine terminology for the SPARC glossary.
Annotation
Addition of analysis information, knowledge or commentary to a data set or part of a data set. The identification of a signal in a micrograph as a mitochondrion would be considered a type of annotation. To increase interoperability among data sets, SPARC encourages and enables annotation with the SPARC Vocabularies. These annotations on top of data then become part of the SPARC Knowledge Graph.
ApiNATOMY
An automated methodology that produces simple anatomy schematics in a consistent manner, and provides for the overlay of anatomy-related information onto the same diagram. ApiNATOMY draws upon the topology of anatomy ontology graphs to automatically lay out treemaps representing body parts as well as semantic metadata that link to such ontologies.
ApiNATOMY is used in SPARC to build routing and connectivity graphs for anatomical entities. Such graphs support queries that, for instance, identify neural connections that course through a tract, nerve or ganglion. ApiNATOMY-based knowledge allows the SPARC user to determine the nuclei/grey matter regions affected by the transection of a nerve or the stimulation of a ganglion.
In addition, the same routing information leveraged by the flatmap GUI may be used to discover metadata about SPARC experimental data or simulation models located along the route of a tract, nerve or ganglion. For an example of an ApiNATOMY map, see the diagram on the SAWG page.
BIDS: Brain Imaging Data Structure
BIDS is an INCF-endorsed standard that prescribes a formal way to name and organize neuroimaging data and metadata in a file system, simplifying communication and collaboration between users. BIDS also enables easier data validation and software development by using consistent paths and naming for data files. The SPARC Dataset Structure is based on BIDS.
Bioelectronic Medicine
The convergence of molecular medicine, neuroscience, engineering and computing to develop devices to diagnose and treat diseases (Olafsson and Tracey, 2017).
CC BY 4.0
All SPARC public data will be released under the Creative Commons Attribution license. The terms of the license require that you must give appropriate credit to the provider, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
CellML
An XML format for encoding mathematical models in a shareable, modular, and reusable manner. A core standard of the COMBINE network. Primarily created, edited, and simulated using the OpenCOR software tool.
Computational Service
These are the principal building blocks for computational studies on o²S²PARC. Services accept inputs and produce outputs (which can be stored or used as input for other services). The functions of computational services are manifold and depend largely on the author’s intention. These functions can span from predicting cardiac contractile force to neural spike rates to simply summing two inputs. Most computational services have parameters that are editable by the user to explore the effects of these parameters on the simulation outputs.
Computational Study
A computational study is essentially a simulation project on o²S²PARC. It is visualized as a nested graph that represents a workflow of modeling services, how they are pipelined, and what the service parameters are. Some of the nodes might also represent other operations, such as retrieving data from or storing data to DAT-Core. A study describes a full simulation from input to final output, including how the intermediate processing or computational steps are linked, and should be reproducible. It primarily captures the setup, but can also include links to results.
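As a rough illustration of the service and study concepts above, the following Python sketch chains two toy services into a small pipeline. The names `Service`, `run_study`, `add` and `scale` are hypothetical and are not part of the o²S²PARC API; the sketch only shows how the output of one service can feed the input of another, and how a user-editable parameter changes the result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Service:
    """A toy computational service: named inputs in, one output out."""
    name: str
    func: Callable[..., float]
    inputs: List[str]  # names of upstream values this service consumes
    params: Dict[str, float] = field(default_factory=dict)  # user-editable


def run_study(services: List[Service], values: Dict[str, float]) -> Dict[str, float]:
    """Run services in the listed order; store each output under the service name."""
    results = dict(values)
    for svc in services:
        args = [results[key] for key in svc.inputs]
        results[svc.name] = svc.func(*args, **svc.params)
    return results


def add(a: float, b: float) -> float:
    """Simply sum two inputs."""
    return a + b


def scale(x: float, factor: float = 1.0) -> float:
    """Multiply an input by an editable factor."""
    return x * factor


# A two-node "study": the output of "sum" is pipelined into "scaled".
study = [
    Service("sum", add, inputs=["a", "b"]),
    Service("scaled", scale, inputs=["sum"], params={"factor": 2.0}),
]
outputs = run_study(study, {"a": 1.0, "b": 2.0})
print(outputs["scaled"])
```

Changing `factor` and re-running the study mimics exploring how a service parameter affects the simulation output.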
Curation
The organization, annotation, publication and presentation of data according to a set of standards enforced by the SPARC data repository. The goal of curation is to ensure that data are organized in a consistent and machine-readable format and that the necessary metadata are available to interpret and reuse the data. Data curation includes both manual and automated processes.
DAT-Core
One of the 4 cores comprising the SPARC Data & Resource Center, responsible for storing, organizing, managing, and tracking access to data and resources generated by SPARC. See also SIM-Core, MAP-Core and K-Core.
Datasets
A dataset is a repository of data and metadata that can be selectively shared with users of the Pennsieve platform. A dataset can be private to its creator, shared selectively with users or teams in an organization, or accessible to all users of an organization. Datasets can be published to Pennsieve Discover, and ultimately the SPARC Portal, where data is publicly available.
Digital Object Identifier (DOI)
A DOI is a globally unique Persistent Identifier that identifies a digital object such as an article, data set or protocol. SPARC uses DOIs to identify data sets and protocols. SPARC DOIs for data sets are managed by datacite.org and are assigned when users publish a version of a dataset, or when they reserve a DOI prior to publishing on the Pennsieve platform. DOIs for protocols are issued by Protocols.io when a protocol is made public. The DOI of a dataset is the standard way to reference datasets in publications.
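A DOI is typically turned into a clickable link by prefixing it with the public resolver address `https://doi.org/`. The Python sketch below shows one way to do that. The helper name `doi_to_url` and the example DOI are hypothetical; the only checks applied are that the string starts with `10.` and contains a `/`, which is a loose sanity check rather than full validation.

```python
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI, accepting an optional 'doi:' prefix."""
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Loose sanity check only: DOIs start with "10." and have a prefix/suffix split.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


# Hypothetical DOI used only for illustration; it does not identify a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))
```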
Embargo
SPARC datasets are subject to a 1-year embargo during which time the datasets are visible only to members of the SPARC Consortium. During embargo, the public will be able to view basic metadata about these data sets as well as their release date. More information about embargoed datasets can be found here.
FAIR Data Principles
High level principles designed to make data Findable, Accessible, Interoperable and Reusable for both humans and machines (Wilkinson et al., 2016). The principles encompass 15 guidelines designed to improve the usability of digital data. More details can be found at the GO-FAIR initiative. SPARC is adopting these principles, e.g., the use of persistent identifiers, FAIR vocabularies and community standards to ensure that SPARC data is FAIR.
Files
A self-contained package of information, used by computer systems and applications, that contains input or output data.
Flatmaps
Two dimensional representations of anatomical structures and connectivity that serve as the query interfaces and visual representations of the SPARC knowledge graph.
Interlex
Interlex is a platform developed and maintained by the UCSD FAIR Data Informatics Lab to make it easier for researchers to use and build FAIR vocabularies for data annotation and search. Interlex allows researchers to add their own terms and link them to existing ontologies. SPARC is using Interlex to enhance existing ontologies for use in SPARC.
K-Core
One of the 4 cores comprising the SPARC Data & Resource Center, functions as the curation and knowledge management hub for SPARC, working closely with the other cores on increasing the quality and FAIRness of SPARC datasets and building the SPARC Knowledge Graph and services. See also: DAT-Core, MAP-Core and SIM-Core.
Knowledge graph
Knowledge graphs allow users to search for a particular entity, e.g., a gene, plus search for related entities represented in machine-processable form.
Landing page
A web page that provides basic metadata about a data set or other digital object that is the first place a user “lands” when clicking on the DOI for that data set. The landing page provides basic information such as a title, description, author and license, but also provides information about what files are included in the data and how they are organized.
License
The means by which the copyright holder grants specific rights to the general public for reuse of the digital object. SPARC data is released under a public license, the CC-BY license. Under this license, copyright holders give permission for others to reuse or adapt their work as long as licensees obey the terms and conditions of the license, in particular that the creator of the data is attributed.
MAP-Core
One of the 4 cores comprising the SPARC Data & Resource Center, responsible for building interactive, modular, continually updated visualizations of nerve-organ anatomy and function. See also: SIM-Core, DAT-Core and K-Core.
Metadata
“Data about data”. Metadata provides additional information about a dataset. Metadata range from descriptive metadata, i.e., information about the source or context of the data, e.g., title, description, techniques, to structural information about the data set, e.g., how many files, what formats, data sizes, to administrative information, e.g., who owns the data and the license under which they are released.
Metadata model
The elements of metadata collected about a digital object and their organization. For example, the SPARC metadata model for a data set includes a title, description, author etc. The SPARC Minimal Information Standard includes information such as the organism used and attributes of the organism such as age, sex etc.
Minimal Information about a Neuroscience Data Standard (MINDS)
MINDS is a metadata standard developed by the European Human Brain Project through INCF that provides a set of standardized fields for describing neuroscience data sets. MINDS has not yet been formally released to the public, but as it represents a reasonable set of metadata fields, we have adopted its use in SPARC to organize some of the descriptive metadata that accompanies SPARC data sets.
Ontology
A set of concepts and categories in a subject area or domain that shows their properties and the relations between them. When these are encoded using formal logic-based computer languages, e.g., OWL (Web Ontology Language), a computer can perform some of the same types of reasoning as a human. For example, a computer would be able to reason that a dorsal root ganglion was part of the peripheral nervous system. Examples of ontologies in use in SPARC include UBERON and the Gene Ontology. SPARC uses ontologies for annotation of data, to enhance search and to encode knowledge arising from the SPARC project.
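To give a feel for the kind of reasoning described above, the following Python sketch derives the dorsal root ganglion conclusion from two toy facts. It is not OWL and does not use UBERON or any SPARC service; the triples, the relation names `is_a` and `part_of`, and the function `infer_part_of` are simplifications assumed for illustration.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Toy facts as (subject, relation, object) triples, assumed for illustration.
FACTS: Set[Triple] = {
    ("dorsal root ganglion", "is_a", "sensory ganglion"),
    ("sensory ganglion", "part_of", "peripheral nervous system"),
}


def infer_part_of(facts: Set[Triple]) -> Set[Triple]:
    """Derive part_of facts: part_of is transitive and is inherited through is_a."""
    derived = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be added
        changed = False
        for (a, r1, b) in list(derived):
            if r1 not in ("is_a", "part_of"):
                continue
            for (c, r2, d) in list(derived):
                if c != b or r2 != "part_of":
                    continue
                # (a is_a/part_of b) and (b part_of d) implies (a part_of d)
                new_fact = (a, "part_of", d)
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


closure = infer_part_of(FACTS)
print(("dorsal root ganglion", "part_of", "peripheral nervous system") in closure)
```

A real ontology reasoner works from formal axioms rather than hand-written rules, but the principle of deriving unstated facts from stated ones is the same.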
ORCID ID
ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between you and your professional activities ensuring that your work is recognized. SPARC encourages users to link their ORCID account to their Pennsieve profile as this information is associated with the dataset contributors when datasets are published. SPARC requires the corresponding contributor (the person who publishes the dataset) to associate their ORCID account.
Organization
An account on the Pennsieve Data Management platform (e.g., SPARC Consortium).
o²S²PARC
The o²S²PARC is an online platform that hosts simulations of physiological processes contributed by SPARC groups and maintained by SIM-Core. Through the SPARC Portal, any user can directly investigate a particular contributed simulation through a “Run Simulation” link. They will then be able to change simulation parameters and run the simulation pipeline. Registered users who have an o²S²PARC account have access to greater functionality such as editing existing simulation workflows and creating their own simulations.
Pennsieve Data Management Platform
The Pennsieve Data Management Platform is a cloud-based platform for scientific data curation and management. This platform is used to prepare SPARC datasets for publication and to share and leverage datasets privately before data is made public.
Pennsieve Discover
Pennsieve Discover is a cloud-based platform where datasets from the Pennsieve Data Management platform are published and made publicly available. Pennsieve Discover provides tools to allow anyone to interact with public datasets as well as an API (Application Programming Interface) to programmatically navigate, browse, and discover new data.
Persistent Identifier
One of the core principles of FAIR is to use a persistent identifier (PID) to identify digital objects such as articles, data sets and protocols. Globally unique, persistent identifiers ensure that digital objects can be reliably found, that is, no broken links! PIDs are identifiers that are unique and never change: they point to one, and only one, digital object, e.g., a particular scientific article, and the persistent identifier may never be reused for another object. PIDs are issued by registries to ensure that the identifier is unique and that metadata that describes the object identified is available. Persistence of the ID is essentially a social contract: if you request a PID for an object, then you agree to keep the registry updated should the object move locations. For example, if a data set identified by a DOI moved to a new URL, the registry would have to be informed of the move. The FAIR principles require that the metadata that describes the object must persist even if the underlying object has been removed, i.e., if the data are no longer available for some reason, the identifier still resolves to a page, usually called a tombstone page, that describes the object and its disposition. A digital object identifier (DOI) is a well-known example of a PID and is used in SPARC.
PMR: Physiome Model Repository
The Physiome Model Repository (PMR) is a model repository that includes some of the physics-based models, as well as the scaffolds that are generated for the SPARC project. Each model will have a unique ID, including a DOI. The repository includes version control and is hosted on AWS servers located in the USA.
Properties
A data property refers to a property of a model. For example, a model "Person" may have the properties “First name” and “Last name”.
Protocols.io
Online platform for sharing, creating and managing experimental protocols. Datasets in SPARC are accompanied by detailed experimental protocols to provide details about how the data were collected. SPARC investigators are required to make these experimental protocols available through the SPARC group, which are then linked to the data set in Pennsieve Discover portal. Protocols are private to the consortium until the accompanying data are made public. Protocols released to the public are assigned a DOI so they can be appropriately referenced. Many journals allow links to protocols in Protocols.io to be included in research articles.
Provenance
Information on the origins and history of a data set or other digital object, e.g., a description of the workflow that led to the data, who generated or collected the data, how the data were processed, whether the data set contains data from someone else that may have been transformed or completed, and who to cite and/or how the contributors wish to be acknowledged. Provenance is one of the FAIR principles, as understanding this type of information about a data set enhances its reusability and also allows credit and attribution for those that contributed to the data. As per FAIR principle R1.2, this information is ideally described in a machine-readable format.
Publish
The act of making a data set or other digital object available to those outside of the SPARC community. A data set is considered published in SPARC when it is released to the Pennsieve Discover portal with a DOI and CC-BY license for reuse. For data sets still under embargo, the descriptive metadata about the data set are published with a DOI but no license, as access to the data requires permission from the author.
Records
A data record is an instance of a data model. If there is a model “Person” with properties “First name” and “Last name”, a sample data record could be: [Person 1: “First name = John”, “Last name = Smith”]
Relationships
A data relationship refers to a connection between records in the knowledge graph. For example, one can define a relationship between a “Study” and an “Experiment” by stating that a particular experiment “belongs-to” a particular study. These relationships are also called “edges” between the record “nodes” in the graph.
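The Properties, Records and Relationships entries can be pictured together with a small Python sketch. The classes `Model` and `Record`, the `Title` property and the record values for the study and experiment are hypothetical and do not reflect the Pennsieve API; only the "Person" example and the "belongs-to" relationship come from the definitions above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Model:
    """A model names the properties its records may carry."""
    name: str
    properties: List[str]


@dataclass
class Record:
    """A record is an instance of a model with a value per property."""
    model: Model
    values: Dict[str, str]

    def __post_init__(self) -> None:
        # Reject values for properties the model does not define.
        unknown = set(self.values) - set(self.model.properties)
        if unknown:
            raise ValueError(f"unknown properties: {sorted(unknown)}")


person = Model("Person", ["First name", "Last name"])
john = Record(person, {"First name": "John", "Last name": "Smith"})

# Hypothetical "Title" property, used only to give the records some content.
study = Record(Model("Study", ["Title"]), {"Title": "Study A"})
experiment = Record(Model("Experiment", ["Title"]), {"Title": "Experiment 1"})

# A relationship is an edge (source record, label, target record) in the graph.
edges: List[Tuple[Record, str, Record]] = [(experiment, "belongs-to", study)]

for source, label, target in edges:
    print(source.values["Title"], label, target.values["Title"])
```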
Revision
A dataset revision refers to an update made to dataset metadata (e.g., title, subtitle, description) that does not require an updated DOI.
Scaffold
A mathematical model providing a 3D coordinate framework for defining anatomical shape and other embedded anatomical data. The model uses high order (Cubic Hermite) finite element basis functions to capture complex geometry with a small number of parameters (defined at the ‘nodes’ of the finite element mesh) that can be optimised to fit the scaffold to anatomical measurements. A wide range of anatomical data (including models derived from that data) can be embedded in the scaffold as fields that are defined by additional nodal parameters – for example, muscle tissue structure, collagen density and orientation, vascular structure, neural connectivity, etc. The 3D shape of a scaffold can change with time (beating heart, breathing lung, filling and emptying of the bladder, peristalsis in the colon, etc.) but because the anatomical structures embedded within the scaffold are defined in terms of material coordinates, they move with the deforming scaffold. A 3D finite element mesh, of any type and at any specified spatial resolution, can be generated automatically from the scaffold for use with physics-based modelling of physiological function.
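The cubic Hermite basis functions mentioned above can be illustrated in one dimension. The Python sketch below evaluates the four standard cubic Hermite basis functions on a single element and interpolates between two nodes from a value and a derivative at each node. It is a 1D teaching example under that assumption, not the scaffold software's implementation, and the function names are hypothetical.

```python
from typing import Tuple


def hermite_basis(t: float) -> Tuple[float, float, float, float]:
    """Evaluate the four cubic Hermite basis functions at t in [0, 1]."""
    h00 = 2 * t**3 - 3 * t**2 + 1   # weights the value at node 0
    h10 = t**3 - 2 * t**2 + t       # weights the derivative at node 0
    h01 = -2 * t**3 + 3 * t**2      # weights the value at node 1
    h11 = t**3 - t**2               # weights the derivative at node 1
    return h00, h10, h01, h11


def interpolate(x0: float, dx0: float, x1: float, dx1: float, t: float) -> float:
    """Interpolate between two nodes from their values and derivatives."""
    h00, h10, h01, h11 = hermite_basis(t)
    return h00 * x0 + h10 * dx0 + h01 * x1 + h11 * dx1


# Four nodal parameters (two values, two derivatives) define the whole curve.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, interpolate(0.0, 1.0, 1.0, 1.0, t))
```

With both derivatives equal to the slope between the nodes, the curve reduces to a straight line; changing the derivative parameters bends it without adding nodes, which is why a small number of parameters can capture complex geometry.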
SciCrunch
A platform developed and maintained by the FAIR Data Informatics Lab at UCSD for providing unified search across independently maintained databases and other data resources. The platform includes data ingestion, curation tools and vocabulary services.
SciCrunch Knowledge Graph
Comprises the mark up of SPARC data and models with the SPARC vocabularies. The SciCrunch Knowledge Graph is used to enhance search across SPARC data.
SciGraph
SciGraph is an open source Neo4J ontology store that serves the SPARC vocabularies and will house the SPARC Knowledge Graph.
SED-ML: Simulation Experiment Description Markup Language
An XML format for encoding descriptions of simulation experiments (basic workflows) independent of the modelling format used to encode the models used in the experiment. A core standard of the COMBINE network.
Segmentation
The process of partitioning a digital image into multiple segments, generally to extract specific signals or structures from a complex image for the purpose of analysis or communication.
SIM-Core
One of the 4 cores comprising the SPARC Data & Resource Center, responsible for developing an online framework capable of hosting and connecting simulations to create predictive, multiscale, multiphysics models spanning from modulation sources acting at feasible access points through to organ functional responses. See also DAT-Core, MAP-Core and K-Core.
Simulations
This section of the SPARC Portal contains datasets that are computational models, simulations, data processing or data visualization pipelines that can be launched in SPARC's computational modeling platform o²S²PARC.
Software for Organizing Data Automatically (SODA)
SODA is software intended to simplify the organization and submission process of SPARC datasets by handling complex and/or repetitive tasks through an intuitive and interactive interface. The idea for SODA arose during the December 2018 SPARC Hackathon. SODA will provide an interactive interface that, without requiring any coding knowledge, walks SPARC investigators step-by-step through the data organization and sharing workflow while automating repetitive, complex and/or time-consuming tasks. SODA is distributed as a desktop application for Windows, macOS, and Linux. It is currently under development and will be released progressively as features are incorporated.
SPARC Anatomical Working Group (SAWG)
A working group comprised of anatomical experts who assist in the creation and vetting of SPARC vocabularies, flatmaps and scaffolds.
SPARC Data and Resource Center (DRC)
Supports the creation of the SPARC data portal, a multifunctional online hub facilitating coordination, synthesis, and prediction via four Core functionalities: Data Coordination, Map Synthesis, Modeling & Simulation and Knowledge Management. Funded SPARC investigators closely coordinate with the DRC in order to achieve the following core functions:
- Data Coordination Core (DAT-Core) - Store, organize, manage, and track access to data and resources generated by SPARC;
- Map Synthesis Core (MAP-Core) - Build interactive, modular, continually updated visualizations of nerve-organ anatomy and function;
- Modeling and Simulation Core (SIM-Core) - Develop an online framework capable of hosting and connecting simulations to create predictive, multiscale, multiphysics models spanning from modulation sources acting at feasible access points through to organ functional responses;
- Knowledge Management Core (K-Core) - Functions as the curation and knowledge management hub for SPARC, working closely with the other cores on increasing the quality and FAIRness of SPARC datasets and building the SPARC Knowledge Graph and services.
SPARC Data Set
A collection of related data and metadata generally produced by a single laboratory supported by SPARC, uploaded to the SPARC data platform and made available through the SPARC Portal.
SPARC Dataset Structure
The standard means for organizing and naming files for diverse data being generated by the SPARC Consortium. The standard is based on the BIDS format developed originally for neuroimaging. Files are organized into folders and accompanied by a set of descriptive files that contain information on subjects, experimental information and data set descriptions. Folders and files are named according to a standard naming convention. The SPARC Dataset Structure also provides a means to extend the core structure to accommodate most data acquisitions. The use of a common standard facilitates data reuse and integration.
SPARC Knowledge Graph
Knowledge + Data + Models produced by SPARC. It comprises the following:
- Data sets annotated to the SPARC Minimal Information Standard (MIS) for data and SPARC Vocabularies;
- MIS for models and simulations;
- Reference knowledge encoded from community ontologies and extended by SPARC investigators and knowledge extracted from the literature.
SPARC Material Sharing Policy
The guidelines and policies to which SPARC OT awardees must adhere as members of the SPARC Consortium are provided in the SPARC Material Sharing Policy document.
SPARC Minimal Information Standard (MIS)
The minimal metadata and data model for SPARC research objects:
SPARC Dataset MIS: The minimal information model for SPARC datasets, developed by the SPARC Standards Committee. The MIS is encoded in TTL/OWL and is viewable using the Protege Ontology Editor.
The MIS has evolved over time as SPARC standards evolve or new standards are incorporated. For example, MIS 3.0 contains the SPARC Standards for Optical Microscopy Imaging Data and Imaging Metadata. A whitepaper describing the SPARC MIS can be found here.
SPARC Optical Microscopy Imaging Data and Imaging Metadata Standard
The SPARC Standards for Optical Microscopy Imaging Data and Imaging Metadata ensure a consistent set of metadata specific to microscopy image data across the SPARC ecosystem. This imaging MIS is approved and in place.
SPARC Portal
The SPARC Portal is an integrated online platform where users can browse datasets generated by SPARC groups, interact with and discover data with flatmaps and run simulations of physiological processes. This is the main entrypoint for users to access contributions of the SPARC teams. Services within the portal have been provided by MAP-Core, DAT-Core, SIM-Core and K-Core.
SPARC Vocabularies
Set of community ontologies used by SPARC to annotate data and models + custom extensions produced specifically for SPARC. Current ontologies used by SPARC include UBERON, the multi-species anatomy ontology, supplemented by terms from the Foundational Model of Anatomy, multiple brain atlases and others. These community ontologies have been imported into the NIFSTD ontology, which provides the backbone of the SPARC vocabularies.
Teams
A group of users on the Pennsieve Data Management platform (e.g., SPARC Data Curation Team).
Team Black
A diverse group of data analysts and simulations experts based in four different countries who provide feedback regarding the usability and functionality of the o²S²PARC platform. This feedback is incorporated into the o²S²PARC 4-week development cycles.
Team Blue
A group of SPARC program scientists who provide feedback on the design and functionality of the SPARC Portal in 6-week development cycles.
Team Red
A group of subject-matter experts who provide high-level advice to help drive future development of MAP-Core visualizations and tools along with their application on the SPARC Portal as a valuable tool in designing and testing neuromodulation devices.
Version
A dataset version refers to a DOI-specific, version-controlled iteration of a dataset. A new version of a dataset must be released when there are any changes to the files or scientific metadata made within a dataset.
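The distinction between a version and a revision (see the Revision entry) can be sketched as a simple decision rule. The Python function below is a hypothetical illustration, not Pennsieve behavior: the field names in `DESCRIPTIVE_FIELDS` are taken from the examples in the Revision entry, that list is not exhaustive, and treating every other field as scientific metadata is an assumption.

```python
from typing import Iterable

# Metadata fields named in the Revision entry; the real list may be longer.
DESCRIPTIVE_FIELDS = {"title", "subtitle", "description"}


def classify_update(changed_fields: Iterable[str], files_changed: bool) -> str:
    """Return 'version' if a new DOI-specific version is needed, else 'revision'."""
    if files_changed:
        return "version"
    # Assumption: any field outside the descriptive set is scientific metadata.
    if set(changed_fields) - DESCRIPTIVE_FIELDS:
        return "version"
    return "revision"


print(classify_update(["title"], files_changed=False))
print(classify_update(["title"], files_changed=True))
```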