
Reimagining EEG Data for Collective Listening

Lina Lopes · May 02, 2025 · 19 mins read

EEG Data Sharing and Analysis Platforms

Below is a map of the major platforms that support EEG data sharing and/or analysis. Both open-source and proprietary platforms are listed, with a preference for open ones, along with key details for each:

OpenNeuro

  • Overview: OpenNeuro (formerly OpenfMRI) is a free, open neuroimaging data repository launched around 2017–2018 (pmc.ncbi.nlm.nih.gov). It is one of the largest archives for brain imaging data, including EEG, iEEG, MEG, and MRI.

  • Institution/Origin: Led by Stanford University (USA), initially funded as part of the NIH BRAIN Initiative in 2018 (pmc.ncbi.nlm.nih.gov). Developed from the earlier OpenfMRI project, it embraces community-driven data sharing.

  • Current Status: Active – OpenNeuro is actively maintained with over 1,300 public datasets as of 2025 (elifesciences.org). New datasets are regularly added and validated. The latest interface update and dataset count are visible on the website.

  • Standards: Enforces BIDS (Brain Imaging Data Structure) format – all uploads must pass a BIDS validator (pmc.ncbi.nlm.nih.gov). This standardization simplifies file organization and metadata for broad accessibility (pmc.ncbi.nlm.nih.gov).

  • Data Access: Allows public uploads (any researcher can contribute a dataset, which becomes public after a defined embargo) and public downloads without login. Data are shared under CC0 or similar licenses.

  • Tools/Integration: Primarily a repository (web portal) for data storage. It integrates with analysis tools, e.g. quality checks and the partner analysis platform NEMAR (see below) for EEG/MEG/iEEG data (pmc.ncbi.nlm.nih.gov). Users can also pipe OpenNeuro data to external tools like Brainlife or cloud computing resources.
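
To make the BIDS requirement more concrete, here is a minimal sketch of how a BIDS-style EEG file name breaks down into "key-value" entities, a suffix, and an extension. The function names (parse_bids_name, expected_folder) and the simplified pattern are illustrative assumptions, not part of OpenNeuro or the official BIDS validator, which checks far more than this.

```python
import re

# One "key-value" entity such as "sub-01" or "task-rest" (simplified pattern,
# assumed for illustration; the real BIDS schema is stricter).
ENTITY = re.compile(r"^([a-z]+)-([A-Za-z0-9]+)$")


def parse_bids_name(filename):
    """Split a BIDS-style file name into entities, suffix, and extension."""
    stem, dot, extension = filename.partition(".")
    *entity_parts, suffix = stem.split("_")
    entities = {}
    for part in entity_parts:
        match = ENTITY.match(part)
        if match is None:
            raise ValueError(f"malformed entity {part!r} in {filename!r}")
        entities[match.group(1)] = match.group(2)
    if "sub" not in entities:
        raise ValueError(f"no 'sub-' entity in {filename!r}")
    return {"entities": entities, "suffix": suffix, "extension": dot + extension}


def expected_folder(info, datatype="eeg"):
    """Build the sub-/ses-/datatype folder where such a file usually sits."""
    entities = info["entities"]
    folders = [f"sub-{entities['sub']}"]
    if "ses" in entities:
        folders.append(f"ses-{entities['ses']}")
    folders.append(datatype)
    return "/".join(folders)
```

For instance, parse_bids_name("sub-01_ses-02_task-rest_eeg.edf") yields the entities sub, ses, and task, and expected_folder places the file under sub-01/ses-02/eeg. A predictable layout like this is what lets downstream tools find recordings without dataset-specific code.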

NEMAR (NeuroElectromagnetic Data Archive and Tools Resource)

  • Overview: NEMAR is an EEG/MEG data portal built on top of OpenNeuro. It launched in 2019 as a collaboration between the SCCN/UCSD (EEGLAB team) and OpenNeuro (pmc.ncbi.nlm.nih.gov). NEMAR provides specialized tools to curate, search, and analyze EEG, MEG, and iEEG datasets stored on OpenNeuro.

  • Institution/Origin: Developed by the Swartz Center for Computational Neuroscience, UC San Diego (USA), in partnership with OpenNeuro (pmc.ncbi.nlm.nih.gov). Funded by NIH grants to extend OpenNeuro’s functionality for electrophysiology data.

  • Status: Active – NEMAR serves as a front-end web portal (nemar.org) synchronized with OpenNeuro’s database. It doesn’t host new data independently (no separate uploads); rather, it indexes OpenNeuro’s EEG/MEG datasets and offers enhanced services (pmc.ncbi.nlm.nih.gov).

  • Standards: Inherits BIDS from OpenNeuro. Notably, NEMAR was the first open EEG archive to implement Hierarchical Event Descriptors (HED tags) for detailed event metadata (headit.ucsd.edu). It encourages rich metadata and standard ontologies for EEG events.

  • Data Access: All EEG/MEG datasets on OpenNeuro are accessible. NEMAR provides an analysis suite: users can run EEGLAB pipelines in the cloud on selected datasets, perform quality assessments, and visualize data without needing to download locally (pmc.ncbi.nlm.nih.gov).

  • Tools: Analysis-focused – includes high-performance computing integration (via the Neuroscience Gateway, NSG) to preprocess or analyze datasets from the browser. It’s essentially a web analysis platform paired with the OpenNeuro repository (pmc.ncbi.nlm.nih.gov). Data remains stored on OpenNeuro, while NEMAR adds analysis and visualization capabilities.
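
As a rough illustration of what HED event metadata looks like in practice, the sketch below splits a HED-style annotation string into its top-level tags and tag groups, and breaks a single tag into its hierarchy levels. The helper names are hypothetical and the example strings are only illustrative; real annotations should be checked against the HED schema with the official HED tools.

```python
def split_hed_string(hed_string):
    """Split a HED-style annotation on top-level commas, keeping groups intact."""
    items, current, depth = [], [], 0
    for char in hed_string:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in HED string")
        if char == "," and depth == 0:
            # A comma outside any parentheses ends the current tag or group.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if depth != 0:
        raise ValueError("unbalanced '(' in HED string")
    items.append("".join(current).strip())
    return [item for item in items if item]


def tag_levels(tag):
    """Break one slash-separated tag into its hierarchy levels."""
    return [level.strip() for level in tag.split("/")]
```

Here split_hed_string("Sensory-event, Visual-presentation, (Red, Square)") returns two plain tags plus one parenthesized group, and tag_levels("Event/Sensory-event") returns the path from the general term to the specific one. That hierarchy is what allows events to be searched at different levels of detail across datasets.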

EEGBase (EEG/ERP Portal)

  • Overview: EEGBase is an open-source web portal for EEG/ERP experiment data, originating in the mid-2010s (publicly announced by 2014) and officially released by 2017 (nitrc.org). It supports long-term storage, annotation, management, and sharing of EEG/ERP data and metadata (neuroinformatics.kiv.zcu.cz).

  • Institution/Origin: Developed at the University of West Bohemia in Pilsen (Czech Republic) (nitrc.org). The project is part of a neuroinformatics initiative (KIV/NTIS) focusing on standardizing EEG data management. The platform’s software is open-source (Apache 2.0 license; nitrc.org) and available on GitHub (github.com).

  • Status: Active (with registration) – EEGBase requires users to create an account to upload or download data. It contains many EEG/ERP datasets contributed by the community under various licenses, e.g. open for non-commercial use (neuroinformatics.kiv.zcu.cz). The platform is maintained with periodic updates; it was registered on NITRC in 2017, indicating ongoing support (nitrc.org).

  • Standards: Incorporates emerging open standards. EEGBase’s data model aligns with current standardization efforts: it supports semantic metadata (RDF/OWL ontologies) and can export data in HDF5 and other formats (neuroinformatics.kiv.zcu.cz). It predates BIDS, but recent efforts likely added BIDS compatibility for data downloads. (It also supports the EEG Study Schema (ESS) and HED tagging via integration with SCCN tools, to richly describe events; headit.ucsd.edu.)

  • Data Access: Public and controlled – Researchers can upload datasets (with custom metadata templates) and choose to keep them private, share with collaborators, or release publicly (headit.ucsd.edu). Many fully public EEG/ERP datasets are available for download after free registration (neuroinformatics.kiv.zcu.cz). The portal includes search and filter functions to find data by metadata, paradigm, etc.

  • Tools: Repository with built-in analysis – EEGBase is not just a file store; it provides a web interface for basic EEG data visualization and analysis. It enables online annotation, metadata editing, and even running some analyses on the server side (neuroinformatics.kiv.zcu.cz). However, its primary role is data management; its analysis tools are simpler than those of heavy computing platforms.

HeadIT

  • Overview: HeadIT (Human Electrophysiology Anatomic Data and Integrated Tools) is an early EEG data-sharing platform, launched circa 2008–2010. It was one of the first repositories for fully annotated human EEG datasets, enabling researchers to share raw data with rich metadata for re-analysis (headit.ucsd.edu).

  • Institution/Origin: Developed at the Swartz Center for Computational Neuroscience (SCCN), UC San Diego (USA), under NIH funding (NIH grants R01-MH084819 and R01-NS047293) (headit.ucsd.edu). Scott Makeig and colleagues created HeadIT as part of the SCCN’s EEGLAB ecosystem. (Originally accessible via HeadIT.org, now headit.ucsd.edu; rrid.site.)

  • Status: Partially active – The HeadIT site is online and allows browsing and downloading of many EEG studies. However, in recent years the growth has slowed as newer platforms (OpenNeuro/NEMAR) emerged. HeadIT still hosts “hundreds of raw recordings” across a range of EEG experiments, all with detailed task and event descriptions (headit.ucsd.edu). Researchers can still create accounts to upload new data, though the community momentum has shifted to newer standards.

  • Standards: Uses its own XML-based standards: EEG Study Schema (ESS) for experiment metadata and Hierarchical Event Descriptors (HED) for tagging events in EEG recordings (headit.ucsd.edu). These were pioneering efforts toward standardization and have influenced BIDS EEG and other frameworks. HeadIT does not natively use BIDS (being older), but its richly annotated data can be converted.

  • Data Access: Public download, account for upload – Anyone can download publicly released HeadIT datasets without login, after agreeing to a data use agreement (headit.ucsd.edu). Uploading data or accessing private datasets requires an account (headit.ucsd.edu). Data contributors can set access controls: private, shared with specific users, or fully public (headit.ucsd.edu).

  • Tools: Repository with metadata tools – HeadIT’s focus is on sharing and meta-analysis. It doesn’t provide cloud compute pipelines, but it ensures each dataset is comprehensively described (every event code and paradigm explained) for reuse (headit.ucsd.edu). By storing data in common formats (.set, .EDF, etc.), it allows researchers to download and analyze locally (e.g., with EEGLAB). Its integration with ESS/HED aimed to facilitate meta-analyses across studies.

PRED+CT (Patient Repository of EEG Data + Computational Tools)

  • Overview: PRED+CT is an open-source EEG platform focused on clinical (patient) EEG data and integrated analysis tools. Proposed in late 2017 (pubmed.ncbi.nlm.nih.gov), it set out to be a “one-stop” site for gathering EEG data from patients with neurological/psychiatric conditions, alongside standardized tasks and analysis pipelines (pubmed.ncbi.nlm.nih.gov, frontiersin.org). PRED+CT was inspired by OpenfMRI/OpenNeuro, aiming to fill the gap for EEG clinical data sharing (frontiersin.org).

  • Institution/Origin: Initiated at the University of New Mexico (USA) by James Cavanagh and colleagues in the psychology and CS departments (pubmed.ncbi.nlm.nih.gov). It leverages a web framework called “Predict” (predictsite.com) (pubmed.ncbi.nlm.nih.gov). Development was detailed in a 2017 Frontiers in Neuroinformatics article (pubmed.ncbi.nlm.nih.gov).

  • Status: Active (limited) – PRED+CT’s website was launched (predictsite.com) with capabilities to Upload, Download, and use Computational Tools, as illustrated in its interface (pubmed.ncbi.nlm.nih.gov). However, the growth has been modest. It remains a community resource where some clinical EEG datasets are available. For example, a depression vs. control resting-state EEG dataset is part of PRED+CT (researchgate.net), and a Parkinson’s disease EEG dataset (patients ON vs OFF medication) was used in demonstrations (pubmed.ncbi.nlm.nih.gov). The platform is maintained by the UNM team, but the user base is still emerging.

  • Standards: Encourages standard tasks and formats. PRED+CT uses OpenNeuro’s model as a guide (frontiersin.org), so it supports BIDS structure and common EEG formats (.set, .EDF, .mat). All data are in Matlab-readable formats (EEGLAB .set or .mat) for compatibility (researchgate.net). It also emphasizes common clinical task paradigms (e.g., oddball, flanker, resting EEG) so data can be comparable across studies (frontiersin.org).

  • Data Access: Open access – The download section is “fully open (no login or request required)” (researchgate.net). Researchers can freely obtain datasets and associated task definitions. Uploads require user accounts and presumably some curation. PRED+CT aims to include matched healthy controls for patient data and be more accessible than clinic repositories (frontiersin.org).

  • Tools: Integrated analysis tools – True to its name, PRED+CT also hosts computational tools/scripts for EEG analysis. Users can browse “Tasks” (standard experiment paradigms), download example data, and even use built-in analytic pipelines, e.g., for machine learning classification of patient vs control EEG (pubmed.ncbi.nlm.nih.gov). The vision is a platform where one can not only access data but also run large-scale data mining to identify EEG biomarkers for disorders (pubmed.ncbi.nlm.nih.gov). (This aspect is still developing as data accumulate.)

DANDI (Distributed Archive for Neurophysiology Data Integration)

  • Overview: DANDI is a cloud-based repository for neurophysiology datasets, launched in 2019 as part of the BRAIN Initiative (pmc.ncbi.nlm.nih.gov). It accepts electrophysiology data of all scales – from single neurons and local field potentials to EEG/MEG – as well as optical physiology and behavior data. DANDI packages datasets as “Dandisets” with rich metadata.

  • Institution/Origin: Led by researchers at MIT, Dartmouth, and CatalystNeuro in the USA (pmc.ncbi.nlm.nih.gov). It was funded by an NIH grant in 2019 to promote data sharing and standardization (especially the Neurodata Without Borders format).

  • Status: Active – DANDI is a growing repository. Its software is developed openly on GitHub, alongside the DANDI CLI tools. As of 2025, numerous labs have contributed cellular recordings and some EEG/iEEG datasets. It’s open-source and community-driven, encouraging neuroscientists to deposit data for reuse.

  • Standards: Emphasizes NWB (Neurodata Without Borders) for cellular and intracranial electrophysiology, and BIDS for human neuroimaging data (pmc.ncbi.nlm.nih.gov). Every Dandiset must follow a BIDS-like hierarchy (with JSON/YAML metadata and organized file tree) prepared via the DANDI CLI (pmc.ncbi.nlm.nih.gov). DANDI’s team contributes to BIDS and NWB standards development (pmc.ncbi.nlm.nih.gov). In practice, many iEEG/ECoG datasets on DANDI use NWB 2.0, and any MRI/EEG components use BIDS for consistency (pmc.ncbi.nlm.nih.gov).

  • Data Access: Fully public – All DANDI datasets are accessible through a web UI (dandiarchive.org) and via API/CLI. Users can browse, preview, and download data (either entire sets or selective files). Uploading requires using their CLI tool and an account (ensuring data meets format requirements). There are no usage restrictions beyond citation and license terms (typical datasets use open licenses).

  • Tools: Repository with programmatic access – DANDI provides a modern interface and tools for data analysis: an interactive web neurovisualizer for a quick look at the data, and Python clients to stream data into analysis pipelines. It is tightly integrated with Jupyter and cloud computing – e.g., one can launch Google Colab or Binder notebooks pre-loaded with a Dandiset. The focus is on making large-scale neurophysiology data easily reusable by computational scientists (hence the integration with libraries for signal processing and ML).
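
To give a feel for the programmatic access mentioned above, here is a small standard-library sketch that builds request URLs for a Dandiset and its asset listing. The base URL, the endpoint paths, and the page_size parameter reflect my understanding of the public DANDI REST API and should be treated as assumptions to verify against the current API documentation; the helper names are hypothetical, and in practice the official dandi Python client is the more convenient route.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public DANDI REST API.
API_ROOT = "https://api.dandiarchive.org/api"


def dandiset_url(identifier):
    """Return the metadata URL for a Dandiset, zero-padding its ID to 6 digits."""
    ident = str(identifier).zfill(6)
    if not ident.isdigit() or len(ident) != 6:
        raise ValueError(f"unexpected Dandiset identifier: {identifier!r}")
    return f"{API_ROOT}/dandisets/{ident}/"


def assets_url(identifier, version="draft", page_size=25):
    """Return the URL listing the files (assets) of one Dandiset version."""
    query = urlencode({"page_size": page_size})
    return f"{dandiset_url(identifier)}versions/{version}/assets/?{query}"


def fetch_json(url, timeout=30):
    """Download and decode one JSON document (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With network access, fetch_json(dandiset_url(3)) would request the metadata record for Dandiset 000003, and fetch_json(assets_url(3)) would request the first page of its file listing. Keeping URL construction separate from the download step makes the addressing scheme easy to inspect without touching the network.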

DABI (Data Archive for the BRAIN Initiative)

  • Overview: DABI is a data repository created under the NIH BRAIN Initiative to share human and animal neurophysiology data, with a focus on intracranial EEG (iEEG/ECoG) and high-density electrophysiology. It was funded and launched in 2018 (pmc.ncbi.nlm.nih.gov). DABI’s mission is to streamline dissemination of data from BRAIN Initiative projects, while allowing investigators control over sensitive clinical data (pmc.ncbi.nlm.nih.gov).

  • Institution/Origin: Hosted at the University of Southern California (USC) Stevens Neuroimaging and Informatics Institute in the USA (pmc.ncbi.nlm.nih.gov). Dominique Duncan and colleagues manage DABI, in collaboration with multiple BRAIN Initiative consortium sites.

  • Status: Active – DABI is a live repository with an evolving collection of datasets (primarily invasive recordings, but also some scalp EEG and multi-modal data). It provides a secure environment where investigators can upload data (fulfilling NIH sharing mandates) and choose when to release it publicly. Many datasets become public after an embargo, and dozens of iEEG datasets are openly available now.

  • Standards: Flexible, with NWB and BIDS encouraged – DABI accepts multiple data formats to lower the barrier for investigators (pmc.ncbi.nlm.nih.gov), but it strongly encourages NWB and BIDS for standardization (pmc.ncbi.nlm.nih.gov). It can store not just EEG/iEEG but also MRI, CT, behavioral data etc., organizing them by study with rich metadata (it even supports the NeuroImaging Data Model, NIDM, for certain data types; pmc.ncbi.nlm.nih.gov). Several DABI datasets use NWB for time-series and BIDS for accompanying imaging data (pmc.ncbi.nlm.nih.gov).

  • Data Access: Hybrid model – DABI allows investigators to retain ownership and control access. Some datasets are fully public (downloadable via DABI’s web interface after agreeing to terms), while others require submitting a data request to the owners (pmc.ncbi.nlm.nih.gov). This is to accommodate sensitive clinical data (e.g., patient identifiers). DABI fulfills NIH mandates by making data available, but not always instantly open without oversight (pmc.ncbi.nlm.nih.gov). Uploading typically requires being part of a BRAIN Initiative project or an arrangement with DABI.

  • Tools: Repository with basic analytics – DABI’s platform includes a web portal for searching and viewing metadata, and basic visualization of iEEG traces. It also partners with external tools; for example, it supports the NWB Explorer for viewing NWB files in-browser. DABI itself doesn’t offer full analysis pipelines, but by using common standards (NWB/BIDS), it ensures compatibility with a range of neurophysiology analysis toolkits.

Brain-CODE

  • Overview: Brain-CODE is a secure data platform established in 2012 by the Ontario Brain Institute (OBI) in Canada (pmc.ncbi.nlm.nih.gov). It is designed as a large-scale integrated data repository for neuroscience research, especially multi-modal and multi-site studies. Brain-CODE is more of an ecosystem than a simple repository: it handles clinical assessments, EEG/ERP, MRI, genotypes, and more, all under one framework for Ontario-based research programs.

  • Institution/Origin: Developed by the Ontario Brain Institute (Canada) in partnership with Indoc Research and others. It launched in 2012, making it one of the earlier big neuroscience data platforms (pmc.ncbi.nlm.nih.gov). The platform was tailored to support OBI’s province-wide studies (e.g. in neurodevelopment, neurodegeneration, and epilepsy).

  • Status: Active (restricted) – Brain-CODE is actively used to manage data from OBI’s funded programs. It is not an open public repository like OpenNeuro; instead, data are stored in a secure environment. Approved researchers can request access to specific datasets (often after data governance review). The platform continues to be updated with new features (e.g., federated data queries and virtual workspaces for analysis; pmc.ncbi.nlm.nih.gov).

  • Standards: Embraces multiple standards. Brain-CODE uses BIDS for organizing neuroimaging and electrophysiology data (pmc.ncbi.nlm.nih.gov). It also uses REDCap/OpenClinica for clinical data capture, and other domain-specific standards for genomics, etc. (pmc.ncbi.nlm.nih.gov). Uniquely, it has a central federation system that links data across modalities by participant, using consistent IDs and ontologies (pmc.ncbi.nlm.nih.gov). NWB is not a primary format here (the focus is more on BIDS for EEG and standard clinical EEG formats). The platform enforces data quality checks and common data elements across sites.

  • Data Access: Controlled – Brain-CODE holds a wealth of EEG and ERP data (for example, EEG from the Ontario OCD registry, various cognitive task EEGs, etc.), but access is permission-based. Researchers typically apply to the specific research program or use a data access committee process. Once approved, data can be accessed through Brain-CODE’s portal. The platform provides secure virtual desktops and Jupyter notebooks to work with data within the environment (pmc.ncbi.nlm.nih.gov), rather than downloading sensitive data to local machines.

  • Tools: Comprehensive analysis environment – Brain-CODE includes built-in processing pipelines and analysis tools in a secure sandbox (pmc.ncbi.nlm.nih.gov). Users can launch preconfigured pipelines for EEG preprocessing or MRI analysis. There are visualization dashboards and the ability to run custom analyses via virtual machines. The emphasis is on data integration: for example, correlating EEG findings with MRI or clinical measures via the central system. It’s a one-stop platform: data ingestion, curation, quality control, and analysis all happen within Brain-CODE’s infrastructure (pmc.ncbi.nlm.nih.gov).
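
The idea of linking data across modalities by participant can be sketched in a few lines. This is only a toy illustration of joining records on a shared participant ID; it does not reflect Brain-CODE's actual federation system, and the function names, field names, and sample values mentioned below are all hypothetical.

```python
from collections import defaultdict


def link_by_participant(**modalities):
    """Merge per-modality record lists into one dict keyed by participant ID."""
    linked = defaultdict(dict)
    for modality, records in modalities.items():
        for record in records:
            # Everything except the ID is stored under the modality's name.
            fields = {k: v for k, v in record.items() if k != "participant_id"}
            linked[record["participant_id"]][modality] = fields
    return dict(linked)


def complete_cases(linked, required):
    """Return the sorted IDs of participants that have every required modality."""
    return sorted(
        pid for pid, data in linked.items() if all(m in data for m in required)
    )
```

Given an EEG list such as [{"participant_id": "P001", "alpha_power": 4.2}] and a clinical list such as [{"participant_id": "P001", "score": 12}], link_by_participant(eeg=..., clinical=...) groups both records under "P001", and complete_cases then picks out participants who have data in every modality of interest. Consistent IDs are what make this kind of join possible at all.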

PhysioNet

  • Overview: PhysioNet is a pioneering open data repository for physiological signals, established in 1999 (physionet.org). While famous for cardiac and sleep data, it also hosts numerous EEG datasets (especially clinical EEG, e.g. epilepsy and sleep EEG). PhysioNet provides not only data archives but also open-source software tools for analyzing physiological time-series (lyrasisnow.org).

  • Institution/Origin: Created by the MIT Laboratory for Computational Physiology with NIH support (USA) (lyrasisnow.org). It was one of the first platforms to make large collections of biomedical signals freely available to the research community. Initially focused on cardiology (ECG) and critical care data, it expanded to EEG and other neural data over time.

  • Status: Active – PhysioNet is robustly maintained (currently funded by NIH as a national resource). It continues to release new datasets annually, often through the PhysioNet Challenges. As of 2025, PhysioNet hosts several EEG datasets: e.g., the CHB-MIT Scalp EEG Database (pediatric epilepsy EEGs), the EEG Motor Movement/Imagery Dataset (64-channel recordings of motor execution and imagery tasks), and the Sleep-EDF database (EEG polysomnography for sleep), among others. Many are updated or augmented with new data periodically.

  • Standards: Not BIDS by default – PhysioNet predates BIDS, and each dataset may have its own format. Many EEG datasets are provided in formats like European Data Format (EDF) or custom binary formats with accompanying documentation. There is a trend towards more standardized descriptions: for example, recent contributions might include CSV/JSON metadata or follow emerging standards in the field, but BIDS compliance is not strictly required. PhysioNet does use an internal metadata schema and DOI system for each dataset, and encourages contributors to provide extensive documentation.

  • Data Access: Mostly public, with tiered access – Many PhysioNet datasets are open access and freely downloadable. Others are restricted: one must create an account and agree to a data use policy because of health data sensitivities. Certain datasets with sensitive information are credentialed, requiring proof of human-subjects research training before access. In general, anyone can obtain the data for research/non-commercial use. Uploading a dataset to PhysioNet involves a rigorous review process for quality and documentation.

  • Tools: Analysis software and challenges – PhysioNet provides the WFDB toolkit for signal processing, and hosting of code in the companion PhysioNetWorks area. It frequently runs open challenges that provide datasets and ask participants to develop algorithms (including EEG-based challenges like seizure detection). While it’s primarily a data repository, the PhysioNet website allows basic viewing of signals and offers cloud notebooks for some datasets. Its longevity and curation make it a goldmine for developing and benchmarking EEG analysis methods (physionet.org).
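
Since many PhysioNet EEG datasets ship as EDF files, it can help to see how little is needed to inspect one. The sketch below reads the fixed 256-byte block at the start of an EDF file using only the standard library; the field widths follow the published EDF header layout, while the function names and the synthetic demo header are assumptions made for illustration. For real analysis, a dedicated reader such as MNE-Python or pyEDFlib is the safer choice.

```python
# (name, width in bytes) for the fixed 256-byte block that starts an EDF file.
EDF_FIELDS = [
    ("version", 8),
    ("patient_id", 80),
    ("recording_id", 80),
    ("start_date", 8),
    ("start_time", 8),
    ("header_bytes", 8),
    ("reserved", 44),
    ("n_records", 8),
    ("record_duration", 8),
    ("n_signals", 4),
]


def parse_edf_header(raw):
    """Decode the fixed-width ASCII fields of an EDF header into a dict."""
    if len(raw) < 256:
        raise ValueError("an EDF header needs at least 256 bytes")
    header, offset = {}, 0
    for name, width in EDF_FIELDS:
        header[name] = raw[offset:offset + width].decode("ascii").strip()
        offset += width
    for name in ("header_bytes", "n_records", "n_signals"):
        header[name] = int(header[name])
    header["record_duration"] = float(header["record_duration"])
    return header


def read_edf_header(path):
    """Read only the first 256 bytes of a file and parse them."""
    with open(path, "rb") as handle:
        return parse_edf_header(handle.read(256))


def make_demo_header(n_signals=2, n_records=10, record_duration=1.0):
    """Build a synthetic header for experimenting; it is not a real recording."""
    values = {
        "version": "0",
        "patient_id": "X X X X",
        "recording_id": "Startdate X X X X",
        "start_date": "01.01.25",
        "start_time": "00.00.00",
        "header_bytes": str(256 + 256 * n_signals),
        "reserved": "",
        "n_records": str(n_records),
        "record_duration": str(record_duration),
        "n_signals": str(n_signals),
    }
    return b"".join(
        values[name].ljust(width).encode("ascii") for name, width in EDF_FIELDS
    )
```

Calling parse_edf_header(make_demo_header()) returns the number of signals, the number of data records, and the duration of each record, which together give the total recording length. The same parse works on a downloaded file via read_edf_header, without loading any signal data.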

Others and Notable Mentions

  • NIMH Data Archive (NDA): A large NIH-run repository (USA) for mental health research data, including EEG data from many clinical studies. Launched ~2015, it merges NDAR, ABCD, and other initiatives. Not openly browsable – requires proposals and access approvals. NDA is not EEG-specific and has a cost for data deposition (pmc.ncbi.nlm.nih.gov), but it serves as a mandated archive for many NIH-funded EEG studies (especially developmental or psychiatric cohorts). It supports BIDS and other formats, but data are typically not public by default.

  • Brainlife.io: An open-source cloud platform (USA) for neuroimaging analysis rather than raw sharing. Launched ~2017, it allows users to upload data (including EEG) or fetch open data (it integrates with OpenNeuro) and run processing pipelines in the cloud. Brainlife supports BIDS and offers a marketplace of analysis “apps.” It is active and maintained by Indiana University – useful for researchers who want to analyze EEG data online and share results. (Datasets on Brainlife can be private or public; often Brainlife links to data stored on other repositories like OpenNeuro; openneuro.org.)

  • Kaggle: A popular data science competition platform (owned by Google, launched 2010). Not specific to EEG, but has hosted EEG datasets for machine learning challenges (e.g., seizure prediction contests, EEG emotion recognition). Kaggle allows public data uploads/downloads with a user account. It does not enforce scientific standards (no BIDS), but it’s a widely used outlet for sharing preprocessed EEG for ML applications. For instance, the DEAP dataset (music EEG for emotion) is available on Kaggle (kaggle.com). Kaggle is active and convenient, though researchers should verify data quality since it’s not curator-reviewed like the above platforms.

  • Zenodo / Figshare / OSF / Dryad: General-purpose research data repositories that are open and citable. Many EEG researchers use these to share data alongside publications. They are not EEG-specific but deserve mention. For example, the Dryad repository was used to share a rich multi-experiment EEG dataset for speech comprehension (see Dryad-Speech in Part 2; github.com). These platforms usually allow public download and use DOIs; standards like BIDS are optional (user-dependent). They are excellent for long-tail datasets not hosted on the major EEG platforms.

Written by Lina Lopes
Hi, I’m Lina, a consultant, artist, and machine whisperer. I work with data and machine learning to explore radical imagination across science, technology, and art. I’m also known as Diana’s mother.