This repository hosts the back-end code for the Integrated Microscopy and Proteomics (IMP) Platform. Other repositories include the Landing Page and the Cryoglancer Application, a modified Neuroglancer viewer.
The repository contains:
- Files marked `deprecated_`
  - Deprecated code from previous iterations of the pipeline, kept for reference for when this functionality is reimplemented and added to the existing system
- An `environment.yml` for the Python environment that the pipeline uses
- A `Dockerfile` and `docker-compose.yml` that build a container that processes incoming datasets
- `expressjs`
  - The backend for communicating with the Mongo database that stores the dataset metadata
  - This is a modified example project; it serves the platform adequately at its current scale but needs restructuring
- `multiresolution-mesh-creator`
  - A submodule which the pipeline depends on
- `nginx`
  - An example nginx config and Docker setup for hosting datasets for the Cryoglancer viewer to access
- `passthrough_api`
  - An experiment which uses FastAPI to pass through any commands to the Docker container
  - Intended for future use with a Relion Docker environment, so that users can upload just the tilt series rather than a full volume
- `pipeline`
  - The code used to process IMP datasets into something that can be viewed by the platform
 
 
A dataset moves through the platform as follows:
- Prior to upload, the user prepares the volume, objects and any proteomics that they wish to share for visualisation
- Currently the full volume must be uploaded, and any object coordinates need to be provided as a .csv in the same coordinate system
- The platform also supports attaching additional files, such as tilt series or .star files, to the dataset
- Datasets are uploaded to the IMP platform using the MyData Client
- The pipeline watches for new datasets to arrive using Watchdog
- Datasets are processed into Neuroglancer's precomputed format
- Users are alerted by email when the processing is complete or if it has failed
- If the dataset is processed successfully, it will be available for viewing using the Cryoglancer Portal
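
The watching step could be wired up roughly as follows. This is a minimal sketch, not the pipeline's actual code: it assumes that the arrival of a `metadata.json` file marks a new dataset folder, and `dataset_for_event`, `watch` and the `process` callback are hypothetical names. It relies on the `watchdog` package's `Observer` and `FileSystemEventHandler`.

```python
from pathlib import Path
from typing import Callable, Optional


def dataset_for_event(src_path: str, is_directory: bool) -> Optional[Path]:
    """Return the dataset folder if the event is a new metadata.json, else None."""
    path = Path(src_path)
    # Assumption: a dataset is announced by its metadata.json appearing.
    if not is_directory and path.name == "metadata.json":
        return path.parent
    return None


def watch(incoming: str, process: Callable[[Path], None]) -> None:
    """Block forever, calling process(folder) for each dataset seen under `incoming`."""
    # Imported here so dataset_for_event stays usable without watchdog installed.
    import time

    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class NewDatasetHandler(FileSystemEventHandler):
        def on_created(self, event):
            folder = dataset_for_event(event.src_path, event.is_directory)
            if folder is not None:
                process(folder)

    observer = Observer()
    # recursive=True so files created inside dataset subfolders are reported.
    observer.schedule(NewDatasetHandler(), incoming, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```

Calling `watch("/path/to/incoming", print)` (a placeholder path) would print each dataset folder as its metadata arrives. A real handler would also need to wait until the upload has finished before processing.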
 
The pipeline expects an input folder with the following format:
- `metadata.json`
  - `name`: The name of the dataset - `string`
  - `description`: The description of the dataset - `string`
  - `parent_volume`: The filename of the parent volume - `.mrc`
  - `object_volumes`: A list of the filenames of the objects - `[.mrc]`
  - `object_coordinates`: The filename of the coordinates table - `.csv`
  - `object_names`: A list of human readable object names; should match the length of the list of filenames in `object_volumes` - `[string]`
  - `subclasses`: The names of any additional columns in the `object_coordinates` table for visualisation - `[string]`
  - `proteomics`: A table that encodes the Majority Protein IDs and iBAQ of the dataset - `.csv`
  - `other_files`: A list of any additional files for sharing along with the dataset - `[string]`
  - `orcid`: Your ORCiD in "0123-4567-8901-2345" format - `[string]`
  - `doi_attributes`: The attributes of the DOI for minting - `[json]`, for example:

        "doi_attributes": {
          "creators": [{ "name": "Your research group" }],
          "titles": [{ "title": "Test Dataset" }],
          "publisher": "IMP Platform",
          "publication_year": 2023
        }

- `parent_volume.mrc`
  - The volume for the objects to be placed in
  - Take care that the coordinate system matches the object coordinates
- `object_volumes.mrc`
  - One or more objects to be placed in the volume
- `object_coordinates.csv`
  - A table of coordinates and Euler angles for the objects to be placed in the parent volume
  - Columns:
    - Position coordinates - `x`, `y`, `z`
    - Euler angles - `eux`, `euy`, `euz`
    - Object volume filename - `mrcfile`
    - Human readable name of the corresponding object - `name`
    - `index` - The index of the particle as it corresponds to the object list
    - Any additional subclasses you wish to visualise, as referenced by the list of subclasses
- `proteomics.csv`
  - A table of the proteomics information
  - Columns:
    - Majority Protein IDs
    - iBAQ
    - Any additional information for sharing
      - Note that while only Majority Protein IDs and iBAQ will be shown by the IMP platform, any extra columns will still be present for browsing if the file is downloaded
- Any other files listed
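
To make the layout concrete, a `metadata.json` with the fields listed above could be generated along these lines. This is only a sketch: every value is a placeholder, the object filenames are invented for illustration, `build_metadata` and `write_metadata` are hypothetical helper names, and `orcid` is assumed to be a plain string even though its type is listed as `[string]`.

```python
import json
from pathlib import Path


def build_metadata() -> dict:
    """Return an example metadata dictionary; every value is a placeholder."""
    return {
        "name": "Test Dataset",
        "description": "A short description of the dataset",
        "parent_volume": "parent_volume.mrc",
        # Hypothetical object files; object_names must have the same length.
        "object_volumes": ["object_a.mrc", "object_b.mrc"],
        "object_coordinates": "object_coordinates.csv",
        "object_names": ["Object A", "Object B"],
        "subclasses": [],
        "proteomics": "proteomics.csv",
        "other_files": [],
        # Assumption: a single string in the documented format.
        "orcid": "0123-4567-8901-2345",
        "doi_attributes": {
            "creators": [{"name": "Your research group"}],
            "titles": [{"title": "Test Dataset"}],
            "publisher": "IMP Platform",
            "publication_year": 2023,
        },
    }


def write_metadata(folder: Path) -> Path:
    """Write the example metadata into `folder` and return the file path."""
    target = folder / "metadata.json"
    target.write_text(json.dumps(build_metadata(), indent=2), encoding="utf-8")
    return target
```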
 
An example input dataset has been provided at `/example/object_input`.
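
Before uploading, an input folder could be sanity-checked against the format described above. The sketch below is not part of the pipeline: `check_input_folder` is a hypothetical helper, and it assumes that all of the documented coordinate columns are required and that fields such as `proteomics` and `other_files` may be absent.

```python
import csv
import json
from pathlib import Path

# Column names as documented for object_coordinates.csv (assumed all required).
REQUIRED_COORDINATE_COLUMNS = {
    "x", "y", "z", "eux", "euy", "euz", "mrcfile", "name", "index",
}


def check_input_folder(folder: Path) -> list:
    """Return a list of problems found; an empty list means none were found."""
    metadata_path = folder / "metadata.json"
    if not metadata_path.is_file():
        return ["metadata.json is missing"]
    metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    problems = []

    # Every file referenced by the metadata should exist in the folder.
    object_volumes = list(metadata.get("object_volumes") or [])
    referenced = [
        metadata.get("parent_volume"),
        metadata.get("object_coordinates"),
        metadata.get("proteomics"),
    ]
    referenced += object_volumes + list(metadata.get("other_files") or [])
    for filename in referenced:
        if filename and not (folder / filename).is_file():
            problems.append(f"missing file: {filename}")

    # object_names should be the same length as object_volumes.
    if len(metadata.get("object_names") or []) != len(object_volumes):
        problems.append("object_names and object_volumes differ in length")

    # The coordinates table needs the documented columns plus any subclasses.
    coords = metadata.get("object_coordinates")
    if coords and (folder / coords).is_file():
        with open(folder / coords, newline="", encoding="utf-8") as handle:
            header = set(next(csv.reader(handle), []))
        expected = REQUIRED_COORDINATE_COLUMNS | set(metadata.get("subclasses") or [])
        for column in sorted(expected - header):
            problems.append(f"coordinates column missing: {column}")
    return problems
```

Running `check_input_folder(Path("example/object_input"))` against the example dataset would be one way to try it, assuming that relative path matches your checkout.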
An nginx config file for hosting on localhost is provided. The server has to be able to serve compressed files and work around a few caveats with the filenames, so SimpleHTTPServer wasn't sufficient.

You can install nginx on Linux with:

    apt-get install nginx

or on macOS:

    brew install nginx

Add the config file at `/etc/nginx/sites-available/` with a symlink to `/etc/nginx/sites-enabled/`. Edit the config file to point to the folder you want to serve. Useful commands for management include:

    service nginx start
    service nginx restart
    service nginx stop
