41 changes: 41 additions & 0 deletions Cloud-Infrastructure.md
@@ -0,0 +1,41 @@
# Cloud Infrastructure (HPC/DevOps)

> Ideal candidate: skilled HPC engineer versed in cloud, HPC, and DevOps

# Overview

The aim of this task is to create a CI/CD pipeline (GitHub workflow) that includes (i) deploying cloud infrastructure for cluster compute, (ii) configuring it for running HPC application(s), and (iii) running benchmarks for a set of distributed-memory calculations.

# Requirements

1. A working CI/CD pipeline - e.g. a GitHub Action - able to deploy and configure an HPC cluster
2. An automated workflow (using a configurable GitHub Action) to benchmark one or more HPC applications on one or more cloud instance types

# Expectations

- The application may be relatively simple - e.g. Linpack; the focus is on infrastructure
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Should fit within 5 days total.

# User story

As a user of this CI/CD pipeline I can:

- initiate tests for a specific number of scenarios: e.g. 2 nodes, 16 cores per node
- select the instance type to be used (a minimal sketch of turning these choices into a benchmark matrix follows this list)
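
A minimal sketch of how such scenario inputs could be expanded into a benchmark matrix for the workflow - the field names, the JSON layout, and the `c5n.18xlarge` instance type are illustrative assumptions, not part of the task:

```python
# Hypothetical helper: expand user-selected scenarios into a JSON matrix
# that a workflow job could consume (e.g. as a matrix strategy input).
import itertools
import json


def build_matrix(node_counts, cores_per_node, instance_types):
    """Return a list of benchmark scenarios as plain dicts."""
    return [
        {"nodes": n, "cores_per_node": c, "instance_type": t}
        for n, c, t in itertools.product(node_counts, cores_per_node, instance_types)
    ]


if __name__ == "__main__":
    # Example: 2 nodes x 16 cores per node on a single (assumed) instance type.
    print(json.dumps(build_matrix([2], [16], ["c5n.18xlarge"]), indent=2))
```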

# Notes

- Commit early and often

# Suggestions

We suggest:

- using AWS as the cloud provider
- using Exabench as the source of benchmarks: https://github.com/Exabyte-io/exabyte-benchmarks-suite
- using CentOS or similar as operating system
- using Terraform for infrastructure management (a minimal driver sketch follows this list)
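
For the Terraform suggestion, a hedged sketch of driving provisioning from Python is shown below; it assumes a Terraform configuration living in an `infra/` directory that exposes an `instance_type` variable, neither of which is prescribed by the task:

```python
# Sketch: provision and tear down the benchmark cluster via the Terraform CLI.
# Assumes `terraform` is on PATH and the configs live in ./infra (hypothetical).
import subprocess


def provision(instance_type, workdir="infra"):
    """Initialize and apply the Terraform configuration for one instance type."""
    subprocess.run(["terraform", "init"], cwd=workdir, check=True)
    subprocess.run(
        ["terraform", "apply", "-auto-approve", f"-var=instance_type={instance_type}"],
        cwd=workdir,
        check=True,
    )


def teardown(workdir="infra"):
    """Destroy the provisioned resources once benchmarks are done."""
    subprocess.run(["terraform", "destroy", "-auto-approve"], cwd=workdir, check=True)
```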
44 changes: 44 additions & 0 deletions Containerization-HPC.md
@@ -0,0 +1,44 @@
# Containerization / Benchmarks (HPC)

> Ideal candidate: skilled HPC engineer versed in HPC and containers

# Overview

The aim of this task is to build an HPC-compatible container (i.e. [Singularity](https://sylabs.io/guides/3.5/user-guide/introduction.html)) and test its performance against a native installation (no containerization) for a set of distributed-memory calculations.

# Requirements

1. A working deployment pipeline - using any preferred tool such as SaltStack, Terraform, CloudFormation - for building out the computational infrastructure
2. A pipeline for building the HPC-compatible container
3. A set of benchmarks for one or more HPC applications on one or more cloud instance types

# Expectations

- The application may be relatively simple - e.g. Linpack; the focus is on infrastructure
- Repeatable approach (no manual setup "in console")
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Should fit within 5 days total.

# User story

As a user of this pipeline I can:

- build an HPC-compatible container for an HPC executable/code
- run test calculations to assert working state of this container
- (optional) compare the behavior of this container with an OS-native installation (see the sketch after this list)
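
As a rough illustration of the optional comparison, the sketch below times the same benchmark command natively and inside a Singularity image; the `hpl.sif` image name, the Linpack command line, and the MPI launch parameters are placeholders rather than requirements:

```python
# Sketch: wall-clock comparison of a native vs. containerized benchmark run.
import subprocess
import time


def timed_run(cmd):
    """Run a command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder Linpack invocation; adjust ranks/binary to the actual benchmark.
    native = ["mpirun", "-np", "16", "./xhpl"]
    # Hybrid MPI model: mpirun on the host launches ranks inside the container.
    containerized = ["mpirun", "-np", "16", "singularity", "exec", "hpl.sif", "./xhpl"]

    t_native = timed_run(native)
    t_container = timed_run(containerized)
    print(f"native: {t_native:.1f} s, container: {t_container:.1f} s, "
          f"overhead: {100 * (t_container / t_native - 1):.1f}%")
```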

# Notes

- Commit early and often

# Suggestions

We suggest:

- using AWS as the cloud provider
- using Exabench as the source of benchmarks: https://github.com/Exabyte-io/exabyte-benchmarks-suite
- using CentOS or similar as operating system
- using SaltStack or Terraform for infrastructure management
35 changes: 35 additions & 0 deletions End-to-End-Tests.md
@@ -0,0 +1,35 @@
# End-to-end Tests (DevOps)

> Ideal candidate: skilled software engineer versed in application infrastructure and DevOps

# Overview

The aim of this task is to create a simple application package (either Python or JavaScript) that includes
complete application testing infrastructure as well as a complete CI/CD solution using GitHub workflows.

# Requirements

1. A non-trivial application with testable UI components - e.g. a Flask server with a UI or a React app (a minimal Flask sketch follows this list)
2. An appropriate end-to-end testing framework implementation (e.g. Cypress) for the application
3. An automated workflow using GitHub Actions to verify that the tests pass
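
For requirement 1, a minimal Flask starting point could look like the sketch below; the routes and page content are illustrative assumptions, and any equivalently testable application is fine:

```python
# Minimal illustrative Flask app with one UI route and one API route
# that an end-to-end suite (e.g. Cypress) could exercise.
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A trivially assertable page element for an e2e test.
    return "<h1 id='greeting'>Hello, ReWoTe!</h1>"


@app.route("/api/health")
def health():
    # Health endpoint a test can poll before running UI scenarios.
    return jsonify(status="ok")


if __name__ == "__main__":
    app.run(debug=True)
```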

# Expectations

- The application may be relatively simple - the focus is on application infrastructure and DevOps - but the tests must actually verify functionality
- Correctly passes the tests in automation and displays coverage metrics
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Must fit within 5 days total.

# User story

As a developer of this application I can:

- view important coverage metrics of my application
- be aware of the number of tests running/passing when developing

# Notes

- Commit early and often
32 changes: 22 additions & 10 deletions README.md
@@ -10,6 +10,18 @@ We find that regular job interview questions can often be misleading and so use

Each file represents an assignment similar to what one would get when hired.

| Focus | ReWoTe | Keywords |
| ---------------| --------------------------| ------------------------------- |
| Comp. Science | [Convergence Tracker](Convergence-Tracker.md) | Python, OOD, DFT, Planewaves |
| Comp. Science | [Basis Set Selector](Basis-Set-Selector.md) | Python, OOD, DFT, Local-orbital |
| Data Science  | [ML Property Predict](ML-Band-Gaps.md) | Python, ML Models, Scikit, Featurization |
| Front-End / UX | [Materials Designer](Materials-Designer.md) | ReactJS / UX Design, ThreeJS |
| Front-End / UX | [Flowchart Designer](Flowchart-Designer.md) | ReactJS / UX Design, DAG |
| Back-End / Ops | [Parallel Uploader](Parallel-File-Uploader.md) | Python, OOD, Threading, Objectstore |
| CI/CD, DevOps | [End-to-End Tests](End-to-End-Tests.md) | BDD tests, CI/CD workflows, Cypress |
| HPC, Cloud Inf | [Cloud HPC Bench.](Cloud-Infrastructure.md) | HPC Cluster, Linpack, Benchmarks |
| HPC, Containers| [Containerized HPC](Containerization-HPC.md) | HPC Cluster, Containers, Benchmarks |

## Usage

We suggest the following flow:
@@ -25,30 +37,30 @@ See [dev branch](https://github.com/Exabyte-io/rewotes/tree/dev) also.

## Notes

Examples listed here are only meant as guidelines and do not necessarily reflect on the type of work to be performed at the company.
Examples listed here are only meant as guidelines and do not necessarily reflect on the type of work to be performed at the company. Modifications to the individual assignments with an advance notice are encouraged.

Modifications to the individual assignments with an advance notice are encouraged. Candidates are free to share the results.
We will screen for the ability to (1) pick up new concepts quickly, (2) implement a working proof-of-concept solution, and (3) outline how the PoC can become more mature. We value attention to details and modularity.

We will screen for the ability to pick up new concepts quickly and implement a working solution. We value attention to details and modularity.

## Hiring process

Our hiring process in more detail:

| Stage | Target Duration | Topic |
| ----------------- | ----------------- | ------------------------------ |
| 0. Email screen | | why exabyte.io |
| 0. Email screen | | why mat3ra.com / exabyte.io |
| 1. Phone screen | 15-20 min | career goals, basic skillset |
| 2. ReWoTe | 1-2h x 1-5 days | real-world work/thought process|
| 3. On-site meet | 2-4 x 30 min | personality fit |
| 2. ReWoTe | 1-2h x 2-5 days | real-world work/thought process|
| 3. On-site meet | 3-4 x 30 min | personality fit |
| 4. Discuss offer | 30 min | cash/equity/benefits |
| 5. Decision | | when to start |
| 5. References | 2 x 15 min | sanity check |
| 6. Decision | | when to start |

TOTAL: ~2 weeks tentative
TOTAL: ~2 weeks tentative.


## Contact info

With any questions about this repository or our hiring process please contact us at info@exabyte.io.
With any questions about this repository or our hiring process please contact us at info@mat3ra.com.

© 2020 Exabyte Inc.
© 2022 Exabyte Inc.
37 changes: 37 additions & 0 deletions capolanco/README.md
@@ -0,0 +1,37 @@
# K-point convergence tracker (Materials)

> Ideal candidate: scientists skilled in Density Functional Theory and proficient in Python.

# Overview

The aim of this task is to create a Python package that implements an automatic convergence tracking mechanism for a materials simulation engine. The convergence is tracked with respect to the k-point sampling inside the reciprocal cell of a crystalline compound.

# Requirements

1. automatically find the dimensions of a k-point mesh that satisfy a certain criterion for total energy (e.g. total energy is converged to within dE = 0.01 meV); a minimal sketch of this check follows this list
1. the code shall be written in a way that facilitates easy addition of convergence with respect to other characteristics extracted from simulations (forces, pressures, phonon frequencies, etc.)
1. the code shall support VASP or Quantum ESPRESSO
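
A minimal sketch of the convergence check implied by requirement 1 - the function name and threshold handling are assumptions (0.01 meV = 1e-5 eV):

```python
# Sketch: the k-mesh is considered converged when the total-energy change
# between successive (denser) meshes falls below a threshold.
def is_converged(energies_ev, threshold_ev=1e-5):
    """Return True if the last two total energies (eV) differ by less than threshold_ev."""
    if len(energies_ev) < 2:
        return False
    return abs(energies_ev[-1] - energies_ev[-2]) < threshold_ev
```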

# Expectations

- correctly find the k-point mesh that satisfies the total-energy convergence parameters for a set of 10 materials, ranging from Si2, as the simplest, to a 10-20-atom supercell of your choice
- modular and object-oriented implementation
- commit early and often - at least once per 24 hours

# Timeline

We leave exact timing to the candidate. Must fit within 5 days total.

# User story

As a user of this software I can start it by passing:

- path to input data (e.g. pw.in / POSCAR, INCAR, KPOINTS) and
- kinetic energy cutoff

as parameters and get the k-point dimensions (e.g. 5 5 5).

# Notes

- create an account at exabyte.io and use it for calculation purposes
- suggested modeling engine: Quantum ESPRESSO
121 changes: 121 additions & 0 deletions capolanco/RunConvergence.py
@@ -0,0 +1,121 @@
# This script runs a k-point convergence study until the change in total energy drops below a desired threshold

import subprocess
import ioqeclass as qe
import ioclusterclass as cluster

###########################
##### USER INPUTS
###########################

# Define the path of the input file
# The k-grid of this file will be set as the starting point
# for the convergence process
filein='Si.scf.in'

# Define delta energy threshold for
# convergence (eV)
dEthreshold=1.0 # (eV)


###########################
##### DEVELOPERS INPUTS
###########################

## Convergence
# maximum number of iterations
Nitermax=20
# k-grid increment per iteration
kstep=2

## Cluster
# number of nodes to be used
Nnodes=1
# number of processors per node
ppn=8
# queue
queue='OR'
# walltime
walltimehours=5
walltimeminutes=3


###########################
##### PROGRAM
###########################


### Set up initial parameters
# Initialize pw.x input class
qeinput=qe.qepwinput()
# load the pw.x input
qeinput.load(filein)
# Create the input file for initial run
testin='test.scf.in'
testout='test.scf.out'
qeinput.save(filein,testin)
# Create the job to send to the cluster
# Initialize job class
job=cluster.jobclass()
# set up job class
job.name='test'
job.nodes=Nnodes
job.ppn=ppn
job.queue=queue
job.walltimehours=walltimehours
job.walltimeminutes=walltimeminutes
# create the job file
jobname='job.test.sh'
job.createjobQEpw(jobname,testin,testout)
# Run initial test (the echo call is a dry-run placeholder; uncomment the
# qsub line below to actually submit the job to the cluster)
subprocess.run(['echo',f'running {jobname}'])
##subprocess.run(['qsub',f'{jobname}'])

# Initialize pw.x output class
qeoutput=qe.qepwoutput()
# Read the Total energy from the output
qeoutput.getenergy(testout)


# Loop testing for dE
dE=2.0*dEthreshold
EnergyOld=qeoutput.energy
counter=0
while ((dE>dEthreshold) and (counter<Nitermax)):

# Increase counter
counter=counter+1
print(f'\n## Iteration {counter}')

# Increase k grid
qeinput.kgrid=qeinput.kgrid+kstep

# Create the input file for this iteration
testin=f'test.scf{counter}.in'
testout=f'test.scf{counter}.out'
qeinput.save(filein,testin)

# create the job file
jobname=f'job.test{counter}.sh'
job.createjobQEpw(jobname,testin,testout)
# Run QE calculation
subprocess.run(['echo',f'running {jobname}'])
##subprocess.run(['qsub',f'{jobname}'])

# Read the Total energy from the output
qeoutput.getenergy(testout)

# Update dE and EnergyOld
dE=abs(EnergyOld-qeoutput.energy)
EnergyOld=qeoutput.energy
print(f'dE {dE} eV')

# Display results
if (dE<dEthreshold):
print(f'Convergence has been achieved in {counter} iterations')
print(f'for kgrid {qeinput.kgrid}')
print(f'The total energy change is less than {dEthreshold} eV')
else:
print(f'Convergence has NOT been achieved in {counter} iterations')


30 changes: 30 additions & 0 deletions capolanco/Si.scf.in
@@ -0,0 +1,30 @@
&control
calculation='scf',
outdir = '/global/cscratch1/sd/cpolanco/Si/',
prefix='Si',
pseudo_dir = '/global/homes/c/cpolanco/Materials/pseudo/',
restart_mode='from_scratch',
tprnfor= .true.,
tstress= .true.,
/
&system
ibrav= 2,
celldm(1)= 10.20777693,
nat= 2,
ntyp= 1,
ecutwfc= 100.0
/
&electrons
mixing_beta= 0.7
conv_thr= 1.0d-14
/

ATOMIC_SPECIES
Si 28.086 Si.pz-vbc.UPF

ATOMIC_POSITIONS (alat)
Si 0.00 0.00 0.00
Si 0.25 0.25 0.25

K_POINTS (automatic)
8 8 8 0 0 0