Posted about 1 hour ago

Data Delivery Specialist

5 day week · Generous PTO · Onsite · South San Francisco, USA

A healthier future. It’s what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love. That’s what makes us Roche.

Advances in AI, data, and computational sciences are transforming drug discovery and development. Roche’s Research and Early Development organisations at Genentech (gRED) and Pharma (pRED) have demonstrated how these technologies accelerate R&D, leveraging data and novel computational models to drive impact. Seamless data sharing and access to models across gRED and pRED are essential to maximising these opportunities. The new computational sciences Center of Excellence (CoE) is a strategic, unified group whose goal is to harness this transformative power of data and Artificial Intelligence (AI) to assist our scientists in both pRED and gRED to deliver more innovative and transformative medicines for patients worldwide.

The Computational Sciences Center of Excellence (CS CoE) brings together data, AI, and computational expertise to accelerate innovation across gRED and pRED. Within CS CoE, the Data and Digital Catalyst (DDC) organization leads the modernization of our data ecosystem, enabling scalable, data-driven science.

The Data Capability organization within DDC is responsible for establishing foundational data capabilities, including data connectivity, data compliance, scientific content management and data ingestion, curation, integration, and delivery. The team ensures that high-quality, well-structured datasets are available to power analytics, AI/ML, and scientific discovery across Research and Early Development.

The Opportunity:

We are seeking an Associate Data Delivery Specialist to support the delivery and operationalization of real-world data (RWD) and clinical-genomic datasets sourced from external partnerships and public/purchased data collections.

In this entry-level role, you will contribute to the coordination, preparation, and delivery of multimodal, high-dimensional datasets, ensuring they are accessible, well-documented, and ready for use in research, analytics, and AI/ML workflows. You will also support interactions with external data providers and internal stakeholders to ensure efficient and compliant data usage.

You will work within a cross-functional environment spanning data engineering, data science, and research teams, helping to enable data-driven discovery across Roche’s R&D ecosystem.

  • RWD Data Operations & Delivery

    Support intake, tracking, and fulfillment of real-world data requests, including clinical-genomic and multimodal datasets. Assist in preparing datasets for delivery, ensuring completeness, quality, and documentation.

  • External Data Coordination

    Coordinate with external partners (e.g., Caris, FMI) to support data requests, query submissions, and data returns. Assist in managing communications, timelines, and deliverables.

  • Data Governance & Access Support

    Assist in managing data access workflows, ensuring appropriate approvals, training, and compliance with data usage agreements. Track data usage and maintain documentation.

  • High-Dimensional Data Handling

    Work with sequencing, imaging, and proteomics datasets, supporting standardized formatting, validation, and integration readiness. Contribute to handling emerging multimodal data types and evolving standards.

  • Data Delivery & Quality Control

    Perform quality checks, metadata validation, and documentation to ensure datasets are analysis-ready. Support troubleshooting of data delivery issues and escalate when necessary.

  • AI-Assisted Data Curation Support

    Contribute to early-stage efforts in AI-enabled data curation and harmonization, supporting improved scalability and efficiency in data delivery workflows.

  • Collaboration Across Teams

    Partner with internal teams (e.g., AIBT, CBM, gRED TM, pRED DTAs) to support data integration and delivery needs across diverse scientific use cases.

Who You Are:

  • PhD with 0-2 years of experience, Master’s degree with 3-5 years of experience, or Bachelor’s degree with 4-7 years of experience in Data Science, Bioinformatics, Health Informatics, Biomedical Engineering, Computer Science, or a related field, plus experience working with real-world data, clinical data, or biomedical datasets

  • Basic understanding of RWD sources (e.g., EHR, claims, registries, clinical-genomic datasets)

  • Strong attention to detail and commitment to data quality and reliability

  • Strong organizational and communication skills, with the ability to support multiple stakeholders

  • You have technical skills in the following areas:

    Programming: Python (Pandas) or SQL; familiarity with Bash is a plus.

    Data Formats: Experience with structured data (CSV, JSON, Parquet); exposure to scientific formats is a plus.

    Data Platforms: Exposure to cloud environments (AWS S3, GCS, or Azure).

    Tools: Familiarity with Jupyter notebooks, data portals, or workflow tools is beneficial.

Preferred Qualifications:

  • Exposure to clinical-genomic or multimodal datasets (e.g., Caris, FMI, or similar)

  • Familiarity with data governance and compliance in healthcare or life sciences

  • Exposure to AI/ML workflows or data preparation for analytics

  • Understanding of FAIR data principles and metadata standards

  • Interest in working with external data partnerships and large-scale data ecosystems

Onsite presence, on our South San Francisco campus, is expected for at least 3 days a week.

Relocation benefits are not available for this job posting.

The expected salary range for this position based on the primary location of California is $127,800 - $237,300.  Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law.  A discretionary annual bonus may be available based on individual and Company performance.  This position also qualifies for the benefits detailed at the link provided below.

Benefits

Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to, discrimination on the basis of Protected Veteran status, individuals with disabilities status, and consistent with all federal, state, or local laws.

If you have a disability and need an accommodation in relation to the online application process, please contact us by completing the Accommodations for Applicants form.