Metadata
- Name
- Quantified dataset: Overexpression
- Repository
- ZENODO
- Identifier
- doi:10.5281/zenodo.4288515
- Description
- This repository contains the quantified single cell dataset for the chimeric overexpression spheroid experiment described in:
A quantitative analysis of the interplay of environment, neighborhood, and cell state in 3D spheroids
Vito RT Zanotelli
Matthias Leutenegger
Xiao‐Kang Lun
Fanny Georgi
Natalie de Souza
Bernd Bodenmiller
Mol Syst Biol. (2020) 16: e9798
https://doi.org/10.15252/msb.20209798
Please cite this article if you re-use any of the data or code.
This is an export of the processed dataset after quality control. Please consult the README below for a description of the data.
An example script to browse this data using Python can be found here: https://github.com/BodenmillerGroup/SpheroidPublication/blob/oexp_analysis/workflow/notebooks/99_browse_export_data.py.ipynb
or be interactively tried on Google Colab:
https://colab.research.google.com/github/BodenmillerGroup/SpheroidPublication/blob/oexp_analysis/workflow/notebooks/99_browse_export_data.py.ipynb
Export Oexp Analysis
by Vito Zanotelli et al, Bodenmiller Lab UZH, 2020
This is the export of the overexpression dataset from the paper: "A quantitative analysis of the interplay of environment, neighborhood and cell state in 3D spheroids".
Raw data: 10.5281/zenodo.4055780
Please cite the paper if you use this data!
Experimental design (more details in the paper):
Spheroids overexpressing 51 signaling constructs, 4 control constructs (2x GFP, 1x HcRed, 1x Luciferase) and 1 'empty' mock transfection control were grown in 5 replicates on 5 different plates (the 'empty' control has 35 replicates).
Most signaling constructs have a GFP tag. Typically only a subset of cells per sphere was overexpressing.
4 plates were pooled into one block with 240 well barcoding, 2 plates in one block with 120 well barcoding.
A pellet of each pool was generated and cut into several 6 um thick sections.
A subset of these sections (='sites') were stained with an IMC panel and acquired as 1 or more 'acquisitions' containing multiple spheres each.
Spheres in these acquisitions were identified via computer vision and cropped into individual 'images'.
In each image the following 'objects' were identified via computer vision:
'cell's (cell sections)
'nucleiexp' (slightly expanded cell centers around nuclei)
'cyto' (cytoplasm, cell region without nuclei)
-> In the manuscript only 'cell' level data was used.
The data was exported using the 'anndata' csv format: https://anndata.readthedocs.io/en/stable/anndata.AnnData.html
Some notes on the files and their columns:
{object}_X.csv:
The data matrix
Shape: #objects x #features
column metadata: {object}_var.csv table
row metadata: {object}_obs.csv table
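The alignment between the three tables can be checked with a short sketch. It uses only the Python standard library and tiny inline stand-ins for cell_X.csv, cell_obs.csv and cell_var.csv. The toy values, the helper names, and the assumption that the matrix file has no header row while the metadata files do are illustrative only and should be verified against the actual export.

```python
import csv
import io

# Toy stand-ins for cell_X.csv, cell_obs.csv and cell_var.csv (all values invented).
# Assumption: the matrix file has no header row, the metadata files do.
X_CSV = "0.5,1.2,3.0\n0.1,0.9,2.5\n"
OBS_CSV = "object_id,image_id,object_number\n101,7,1\n102,7,2\n"
VAR_CSV = "measurement_id,goodname\n1,GFP\n2,p-ERK\n3,Ki-67\n"


def read_matrix(text):
    """Parse a header-less CSV into a list of float rows."""
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text)) if row]


def read_table(text):
    """Parse a CSV with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def check_alignment(X, obs, var):
    """Verify that rows of X match obs and columns of X match var."""
    if len(X) != len(obs):
        raise ValueError("row count of X does not match obs")
    if any(len(row) != len(var) for row in X):
        raise ValueError("column count of X does not match var")
    return len(X), len(var)


X = read_matrix(X_CSV)
obs = read_table(OBS_CSV)
var = read_table(VAR_CSV)
shape = check_alignment(X, obs, var)
print(shape)  # (2, 3): 2 objects x 3 features
```

With the real files, the inline strings would be replaced by the file contents. Row i of the matrix then belongs to row i of the obs table, and column j to row j of the var table.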
{object}_var.csv:
Variable metadata
For the paper mainly the compensated MeanIntensities (MeanIntensityComp) of an IMC image stack (FullStackFiltered) were used.
For 'cell' objects this export additionally contains min/max and mean intensities from a pixel-wise compensated IMC image (FullStackComp), an immunofluorescence image stack (IfStack, DAPI + GFP channels) and a pixel-probability stack (ProbPos, channels: prop-pos, prop-neg), as well as area and location features.
Other important features:
distrim: Estimated distance to sphere border -> unit 'um'
Center_X/Y: Centroid of object in image -> unit 'um'
dist-sphere: distance to estimated spheroid section border
dist-other: distance to other spheroid section in image
dist-bg: distance to background pixels
- Shape: #features x #columns
- Columns:
- measurement_id: unique measurement id
- measurement_name: Name of measurement (this export: all compensated mean intensity)
- measurement_type: Type of measurement (this export: only Intensity features)
- channel_name, metal: Isotope name
- stack_name: multicolor image stack containing this channel
- ref_plane_number: position of the measured channel in its image stack
- goodname: The name of the marker
no prefix: total protein
p-: phospho protein
[]: phospho residue
BC: barcoding metal
- Antibody Clone: antibody clone name
- is_cc: bool, indication if this marker is considered a classical cell cycle marker
- working: bool, indicates whether the marker is working and of biological value. I would only look at markers with working=1.
Not important:
- scale: scale of raw data (data is already scaled)
- plane_id: database id for image plane.
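The feature selection suggested by these columns (compensated mean intensities on FullStackFiltered, restricted to working markers) can be sketched as follows. The rows and matrix values are invented, 'MaxIntensity' is a placeholder measurement name, and the helper names are hypothetical. How the boolean 'working' flag is spelled in the CSV is an assumption, so the sketch accepts several spellings.

```python
# Toy rows mimicking cell_var.csv (values invented; column names taken from the README).
var = [
    {"goodname": "GFP", "measurement_name": "MeanIntensityComp",
     "stack_name": "FullStackFiltered", "working": "1"},
    {"goodname": "p-ERK", "measurement_name": "MeanIntensityComp",
     "stack_name": "FullStackFiltered", "working": "0"},
    {"goodname": "Ki-67", "measurement_name": "MeanIntensityComp",
     "stack_name": "FullStackFiltered", "working": "True"},
    {"goodname": "GFP", "measurement_name": "MaxIntensity",  # placeholder name
     "stack_name": "FullStackComp", "working": "1"},
]
# Toy data matrix: 2 objects x 4 features, columns in the same order as var.
X = [[0.5, 1.2, 3.0, 9.0],
     [0.1, 0.9, 2.5, 8.0]]


def is_true(value):
    """Assumption: the flag may be stored as 1, 1.0 or True in the CSV."""
    return str(value).strip().lower() in {"1", "1.0", "true"}


def select_features(var, measurement_name="MeanIntensityComp",
                    stack_name="FullStackFiltered"):
    """Return column indices of working markers for one measurement and stack."""
    return [
        i for i, row in enumerate(var)
        if row["measurement_name"] == measurement_name
        and row["stack_name"] == stack_name
        and is_true(row["working"])
    ]


keep = select_features(var)
names = [var[i]["goodname"] for i in keep]
X_sel = [[row[i] for i in keep] for row in X]
print(names)  # ['GFP', 'Ki-67']
```

Selecting by column index keeps the reduced matrix aligned with the reduced var rows, so marker names can still be looked up by position.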
{object}_obs.csv:
Object (cell/nuclei/cytoplasm section) level metadata. For the paper only 'cell' level data was used.
Shape: #objects x #columns
Columns:
object_id: Unique object id (unique also across object types)
image_id: The key linking to the 'image_meta.csv' table
object_number: id corresponding to the object value in the segmentation mask
relations{source}{target}.csv:
Cell relationship graphs
Shape: #relations x #columns
Encoding relations between objects:
cell_neighbors: Neighbourhood graph:
object_id_cell: id of cell
object_id_neighbour: id of neighbor
cell_nuclei: Relationship between cells and nuclei
object_id_cell
object_id_nucleiexp -> This is not necessarily a 1:1 correspondence
cell_cyto: Relationship between cells and cytoplasm
object_id_cell
object_id_cyto -> This is not necessarily a 1:1 correspondence
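As an illustration of how the neighbourhood table can be used, the sketch below builds an adjacency mapping from rows with the columns object_id_cell and object_id_neighbour and averages a per-cell value over each cell's neighbours. The ids and values are invented and the function names are hypothetical. Whether each edge is stored in both directions in the real file is not stated here, so the sketch simply takes the rows as given.

```python
from collections import defaultdict

# Toy rows mimicking the cell_neighbors relation table (ids invented).
relations = [
    {"object_id_cell": 101, "object_id_neighbour": 102},
    {"object_id_cell": 101, "object_id_neighbour": 103},
    {"object_id_cell": 102, "object_id_neighbour": 101},
    {"object_id_cell": 103, "object_id_neighbour": 101},
]
# Toy per-cell values, e.g. one column of the data matrix keyed by object_id.
gfp = {101: 4.0, 102: 1.0, 103: 3.0}


def build_adjacency(relations):
    """Map each cell id to the set of its neighbour ids."""
    adjacency = defaultdict(set)
    for row in relations:
        adjacency[row["object_id_cell"]].add(row["object_id_neighbour"])
    return dict(adjacency)


def neighbour_mean(adjacency, values):
    """Mean value over the neighbours of each cell; cells without known neighbours are skipped."""
    result = {}
    for cell, neighbours in adjacency.items():
        known = [values[n] for n in neighbours if n in values]
        if known:
            result[cell] = sum(known) / len(known)
    return result


adjacency = build_adjacency(relations)
means = neighbour_mean(adjacency, gfp)
print(means)  # {101: 2.0, 102: 4.0, 103: 4.0}
```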
image_meta.csv:
Image (=spheroid section) metadata
Shape: #images x #columns
Columns:
Image metadata:
image_id: The unique key of this table. Each row corresponds to a single spheroid section
image_shape_h/w: height/width of image in pixels/um
acquisition_id: unique id of IMC acquisition this image was cropped from
site_id: unique id of the section this sphere cut comes from.
All cuts in the same section were stained together.
slide_id: unique id for a single slide containing 1 or more sites
sampleblock_id: unique id of the sample block this sphere was pooled and processed in.
Not important:
image_number: original cellprofiler image number
crop_number: object number of the sphere that was used for this crop
image_pos_x/y: top left coordinate of crop of sphere from original acquisition
bc_depth: cells within this distance from border were considered for debarcoding
bc_invalid: number of invalid debarcoded objects in this sphere crop
bc_highest_count: number of cells assigned to the main barcode of this crop
bc_second_count: number of cells assigned to the second most frequent barcode of this crop
barcode: dictionary containing the barcode
bc_plate, bc_x, bc_y: barcode metadata
acquisition_mcd_acid: original MCD acquisition id
site_mcd_panoramaid: original MCD panorama id
acquisition_mcd_roiid: original MCD roiid
slideac_id/name: unique id for each acquisition of a slide. Corresponds to a single mcd file
slide_number: original number of slide this acquisition comes from
Experimental metadata:
condition_id: id of the physical spheroid the slice belongs to. Unique to each sphere replicate.
condition_name: name of the growth condition this sphere came from
plate_id: id of the plate the spheroid was grown in
well_name: position of the well the spheroid was grown in
sampleblock_id/sampleblock_name: id/name of the pooled block the spheroid was processed in
site_id: corresponds to the site the spheroid slice was located on. All spheroid slices in the same site were stained together.
file_name: filename of the segmentation mask found in masks_cell
Filenames:
maskfilename{object}: filename of the object mask corresponding to this image
image_stackfilename{imagestack}: filename of the image stack with this name. Note: all mean intensity measurements are usually done on 'FullStackFiltered' (the raw image, filtered only for strong outliers) and then compensated for metal impurities (as recommended in Chevrier, Zanotelli and Crowell 2018). For visualization and min/max measurements 'FullStackComp' can be used, as there the image itself was corrected for metal impurities. The channel order is the same for both stacks.
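The link between object-level and image-level metadata can be sketched as a join on image_id, for example to select all cells from one growth condition. The rows are invented, including the condition names and the well name format, and the function name is hypothetical. Only the column names come from the description above.

```python
# Toy rows mimicking image_meta.csv and cell_obs.csv (values invented).
image_meta = [
    {"image_id": 7, "condition_name": "GFP", "plate_id": 1, "well_name": "A01"},
    {"image_id": 8, "condition_name": "empty", "plate_id": 1, "well_name": "A02"},
]
obs = [
    {"object_id": 101, "image_id": 7, "object_number": 1},
    {"object_id": 102, "image_id": 7, "object_number": 2},
    {"object_id": 103, "image_id": 8, "object_number": 1},
]


def annotate_objects(obs, image_meta, columns=("condition_name", "well_name")):
    """Left-join selected image_meta columns onto obs rows via image_id."""
    by_image = {row["image_id"]: row for row in image_meta}
    annotated = []
    for row in obs:
        meta = by_image.get(row["image_id"], {})
        # Objects whose image is missing from image_meta get None for each column.
        annotated.append({**row, **{c: meta.get(c) for c in columns}})
    return annotated


annotated = annotate_objects(obs, image_meta)
empty_cells = [r["object_id"] for r in annotated if r["condition_name"] == "empty"]
print(empty_cells)  # [103]
```

Because image_id is the unique key of the image table, each object matches at most one image row and the join never duplicates objects.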
Folder masks:
Folder containing the segmentation masks (See image_meta -> Filenames)
Folder images:
Folder containing the image stacks (See image_meta -> Filenames)
The mapping between channels and image plane numbers is given by the 'ref_plane_number' column of the {object}_var.csv metadata.
- Data or Study Types
- multiple
- Source Organization
- Unknown
- Access Conditions
- available
- Year
- 2020
- Access Hyperlink
- https://doi.org/10.5281/zenodo.4288515
Distributions
- Encoding Format: HTML ; URL: https://doi.org/10.5281/zenodo.4288515