FACT SHEET

Capability: Data analytics

Summary

We have deep expertise in collecting, processing, managing and analyzing large datasets as part of our research activities. This gives us a unique capability to support longitudinal studies involving large numbers of research participants and large datasets spanning various data types.

We leverage established computational infrastructure and data pipelines for the collection, management, organization and pre-processing of these data. These enable us to spin up projects quickly, conduct quality assurance on data flows promptly, and move to analytics and research as early as possible.

Key Elements 

Three key elements underlie this capability:

  1. Tested data collection methods and processing pipelines for multiple data types, and a sustained focus on improvement. By working in step with a core of DGC team members with career expertise in large clinical study management and data management, the DGC has driven the development of processing pipelines and best practices that optimize standardized longitudinal data collection at scale. In line with the DGC core value of continuous evaluation and course correction, a data analytics workgroup convenes monthly to address structural and operational issues and implement refinements. 
  2. Sophisticated computational infrastructure that powers the DGC’s ability to conduct analyses, honor sponsor data requirements, and protect participant privacy. The DGC computing environment is structured to offer and oversee triaged access to deidentified data. The environment has multiple checks and layers of security to ensure the privacy and protection of sensitive data and to protect research participants’ confidentiality. This infrastructure meets stringent sponsor and institutional requirements while still providing the high-performance computational power necessary for advanced data analysis. The data analytics workgroup regularly meets to identify steps necessary to improve computation, from speeding up data pre-processing to evaluating analytics steps, with the overall goal of reducing resource utilization. 
  3. Availability and integration of distinct data types that support the DGC’s unique approach to better understanding and researching depression. DGC team members have extensive experience integrating and analyzing a breadth of data types collected from individual research participants: electronic health records, self-report and interviewer-administered assessments, neuroimaging data, genetic data and data captured from digital sensors. By integrating these different types of data, DGC researchers gain a better understanding of depression, which is not a singular, one-size-fits-all condition but rather a condition with different subtypes characterized by different symptoms, causes and trajectories.

Our team comprises specialized experts to support these capabilities 

DGC team members have specialized expertise and core competencies that support complex and large-scale data collection, storage, transfer and analysis across many different kinds of research and research questions. Our unique datasets attract graduate students and early-career faculty, who join the team to advance our understanding of depression and to develop new analytical methods and tools for analyzing these data.

Experts supporting data analytics environment and functions

The data analytics team comprises faculty, full-time staff, graduate students and administrators, each with one or more of the following areas of expertise:

  • Software engineering
  • Computer science, including artificial intelligence
  • Biomedical informatics
  • Statistics
  • Biostatistics
  • Data science
  • Subject matter expertise in specific data types:
    • Neuroimaging
    • Digital sensing and mHealth
    • Neurocognitive and neuropsychological assessments
    • Genetics
    • Electronic health records
  • Administration and project management for sponsor requirements and data access rights

Primary Goal

To leverage our comprehensive datasets and our best-in-class processes for managing these data to increase scientific understanding of the causes and trajectories of depression. 

Related Information
Status: Active
Funding Source: UCLA