41 changes: 41 additions & 0 deletions Cloud-Infrastructure.md
@@ -0,0 +1,41 @@
# Cloud Infrastructure (HPC/DevOps)

> Ideal candidate: skilled HPC engineer versed in cloud, HPC, and DevOps

# Overview

The aim of this task is to create a CI/CD pipeline (GitHub workflow) that includes (i) deploying cloud infrastructure for cluster compute, (ii) configuring it for running HPC application(s), and (iii) running benchmarks for a set of distributed-memory calculations.

# Requirements

1. A working CI/CD pipeline - e.g. a GitHub Action - able to deploy and configure an HPC cluster
2. An automated workflow (using a configurable GitHub Action) to benchmark one or more HPC applications on one or more cloud instance types

# Expectations

- The application may be relatively simple - e.g. Linpack; the focus is on infrastructure
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Should fit within 5 days total.

# User story

As a user of this CI/CD pipeline I can:

- initiate tests for a specific number of scenarios: e.g. 2 nodes, 16 cores per node
- select the instance type to be used (a minimal sketch of turning these choices into a benchmark matrix follows this list)
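
A minimal sketch of how such scenario inputs could be expanded into a benchmark matrix for the workflow - the field names, the JSON layout, and the `c5n.18xlarge` instance type are illustrative assumptions, not part of the task:

```python
# Hypothetical helper: expand user-selected scenarios into a JSON matrix
# that a workflow job could consume (e.g. as a matrix strategy input).
import itertools
import json


def build_matrix(node_counts, cores_per_node, instance_types):
    """Return a list of benchmark scenarios as plain dicts."""
    return [
        {"nodes": n, "cores_per_node": c, "instance_type": t}
        for n, c, t in itertools.product(node_counts, cores_per_node, instance_types)
    ]


if __name__ == "__main__":
    # Example: 2 nodes x 16 cores per node on a single (assumed) instance type.
    print(json.dumps(build_matrix([2], [16], ["c5n.18xlarge"]), indent=2))
```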

# Notes

- Commit early and often

# Suggestions

We suggest:

- using AWS as the cloud provider
- using Exabench as the source of benchmarks: https://github.com/Exabyte-io/exabyte-benchmarks-suite
- using CentOS or similar as operating system
- using Terraform for infrastructure management (a minimal driver sketch follows this list)
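
For the Terraform suggestion, a hedged sketch of driving provisioning from Python is shown below; it assumes a Terraform configuration living in an `infra/` directory that exposes an `instance_type` variable, neither of which is prescribed by the task:

```python
# Sketch: provision and tear down the benchmark cluster via the Terraform CLI.
# Assumes `terraform` is on PATH and the configs live in ./infra (hypothetical).
import subprocess


def provision(instance_type, workdir="infra"):
    """Initialize and apply the Terraform configuration for one instance type."""
    subprocess.run(["terraform", "init"], cwd=workdir, check=True)
    subprocess.run(
        ["terraform", "apply", "-auto-approve", f"-var=instance_type={instance_type}"],
        cwd=workdir,
        check=True,
    )


def teardown(workdir="infra"):
    """Destroy the provisioned resources once benchmarks are done."""
    subprocess.run(["terraform", "destroy", "-auto-approve"], cwd=workdir, check=True)
```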
44 changes: 44 additions & 0 deletions Containerization-HPC.md
@@ -0,0 +1,44 @@
# Containerization / Benchmarks (HPC)

> Ideal candidate: skilled HPC engineer versed in HPC and containers

# Overview

The aim of this task is to build an HPC-compatible container (i.e. [Singularity](https://sylabs.io/guides/3.5/user-guide/introduction.html)) and test its performance against a native installation (no containerization) for a set of distributed-memory calculations.

# Requirements

1. A working deployment pipeline - using any preferred tool such as SaltStack, Terraform, CloudFormation - for building out the computational infrastructure
2. A pipeline for building the HPC-compatible container
3. A set of benchmarks for one or more HPC applications on one or more cloud instance types

# Expectations

- The application may be relatively simple - e.g. Linpack; the focus is on infrastructure
- Repeatable approach (no manual setup "in console")
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Should fit within 5 days total.

# User story

As a user of this pipeline I can:

- build an HPC-compatible container for an HPC executable/code
- run test calculations to assert working state of this container
- (optional) compare the behavior of this container with an OS-native installation (see the sketch after this list)
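
As a rough illustration of the optional comparison, the sketch below times the same benchmark command natively and inside a Singularity image; the `hpl.sif` image name, the Linpack command line, and the MPI launch parameters are placeholders rather than requirements:

```python
# Sketch: wall-clock comparison of a native vs. containerized benchmark run.
import subprocess
import time


def timed_run(cmd):
    """Run a command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder Linpack invocation; adjust ranks/binary to the actual benchmark.
    native = ["mpirun", "-np", "16", "./xhpl"]
    # Hybrid MPI model: mpirun on the host launches ranks inside the container.
    containerized = ["mpirun", "-np", "16", "singularity", "exec", "hpl.sif", "./xhpl"]

    t_native = timed_run(native)
    t_container = timed_run(containerized)
    print(f"native: {t_native:.1f} s, container: {t_container:.1f} s, "
          f"overhead: {100 * (t_container / t_native - 1):.1f}%")
```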

# Notes

- Commit early and often

# Suggestions

We suggest:

- using AWS as the cloud provider
- using Exabench as the source of benchmarks: https://github.com/Exabyte-io/exabyte-benchmarks-suite
- using CentOS or similar as operating system
- using SaltStack or Terraform for infrastructure management
35 changes: 35 additions & 0 deletions End-to-End-Tests.md
@@ -0,0 +1,35 @@
# End-to-end Tests (DevOps)

> Ideal candidate: skilled software engineer versed in application infrastructure and DevOps

# Overview

The aim of this task is to create a simple application package (either Python or JavaScript) that includes
complete application testing infrastructure as well as a complete CI/CD solution using GitHub workflows.

# Requirements

1. A non-trivial application with testable UI components - e.g. a Flask server with a UI or a React app (a minimal Flask sketch follows this list)
2. An appropriate end-to-end testing framework implementation (e.g. Cypress) for the application
3. An automated workflow using GitHub Actions to verify that the tests pass
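
For requirement 1, a minimal Flask starting point could look like the sketch below; the routes and page content are illustrative assumptions, and any equivalently testable application is fine:

```python
# Minimal illustrative Flask app with one UI route and one API route
# that an end-to-end suite (e.g. Cypress) could exercise.
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A trivially assertable page element for an e2e test.
    return "<h1 id='greeting'>Hello, ReWoTe!</h1>"


@app.route("/api/health")
def health():
    # Health endpoint a test can poll before running UI scenarios.
    return jsonify(status="ok")


if __name__ == "__main__":
    app.run(debug=True)
```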

# Expectations

- The application may be relatively simple - the focus is on application infrastructure and DevOps - but the tests must actually verify functionality
- Correctly passes the tests in automation and displays coverage metrics
- Clean workflow logic

# Timeline

We leave exact timing to the candidate. Must fit within 5 days total.

# User story

As a developer of this application I can:

- view important coverage metrics of my application
- be aware of the number of tests running/passing when developing

# Notes

- Commit early and often
32 changes: 22 additions & 10 deletions README.md
@@ -10,6 +10,18 @@ We find that regular job interview questions can often be misleading and so use

Each file represents an assignment similar to what one would get when hired.

| Focus | ReWoTe | Keywords |
| ---------------| --------------------------| ------------------------------- |
| Comp. Science | [Convergence Tracker](Convergence-Tracker.md) | Python, OOD, DFT, Planewaves |
| Comp. Science | [Basis Set Selector](Basis-Set-Selector.md) | Python, OOD, DFT, Local-orbital |
| Data Science  | [ML Property Predict](ML-Band-Gaps.md) | Python, ML Models, Scikit, Featurization |
| Front-End / UX | [Materials Designer](Materials-Designer.md) | ReactJS / UX Design, ThreeJS |
| Front-End / UX | [Flowchart Designer](Flowchart-Designer.md) | ReactJS / UX Design, DAG |
| Back-End / Ops | [Parallel Uploader](Parallel-File-Uploader.md) | Python, OOD, Threading, Objectstore |
| CI/CD, DevOps | [End-to-End Tests](End-to-End-Tests.md) | BDD tests, CI/CD workflows, Cypress |
| HPC, Cloud Inf | [Cloud HPC Bench.](Cloud-Infrastructure.md) | HPC Cluster, Linpack, Benchmarks |
| HPC, Containers| [Containerized HPC](Containerization-HPC.md) | HPC Cluster, Containers, Benchmarks |

## Usage

We suggest the following flow:
@@ -25,30 +37,30 @@ See [dev branch](https://github.com/Exabyte-io/rewotes/tree/dev) also.

## Notes

Examples listed here are only meant as guidelines and do not necessarily reflect on the type of work to be performed at the company.
Examples listed here are only meant as guidelines and do not necessarily reflect on the type of work to be performed at the company. Modifications to the individual assignments with an advance notice are encouraged.

Modifications to the individual assignments with an advance notice are encouraged. Candidates are free to share the results.
We will screen for the ability to (1) pick up new concepts quickly, (2) implement a working proof-of-concept solution, and (3) outline how the PoC can become more mature. We value attention to details and modularity.

We will screen for the ability to pick up new concepts quickly and implement a working solution. We value attention to details and modularity.

## Hiring process

Our hiring process in more detail:

| Stage | Target Duration | Topic |
| ----------------- | ----------------- | ------------------------------ |
| 0. Email screen | | why exabyte.io |
| 0. Email screen | | why mat3ra.com / exabyte.io |
| 1. Phone screen | 15-20 min | career goals, basic skillset |
| 2. ReWoTe | 1-2h x 1-5 days | real-world work/thought process|
| 3. On-site meet | 2-4 x 30 min | personality fit |
| 2. ReWoTe | 1-2h x 2-5 days | real-world work/thought process|
| 3. On-site meet | 3-4 x 30 min | personality fit |
| 4. Discuss offer | 30 min | cash/equity/benefits |
| 5. Decision | | when to start |
| 5. References | 2 x 15 min | sanity check |
| 6. Decision | | when to start |

TOTAL: ~2 weeks tentative
TOTAL: ~2 weeks tentative.


## Contact info

With any questions about this repository or our hiring process please contact us at info@exabyte.io.
With any questions about this repository or our hiring process please contact us at info@mat3ra.com.

© 2020 Exabyte Inc.
© 2022 Exabyte Inc.
37 changes: 37 additions & 0 deletions capolanco/README.md
@@ -0,0 +1,37 @@
# K-point convergence tracker (Materials)

> Ideal candidate: scientists skilled in Density Functional Theory and proficient in Python.

# Overview

The aim of this task is to create a Python package that implements an automatic convergence tracking mechanism for a materials simulation engine. The convergence is tracked with respect to the k-point sampling inside the reciprocal cell of a crystalline compound.

# Requirements

1. automatically find the dimensions of a k-point mesh that satisfy a certain criterion for total energy (e.g. total energy is converged to within dE = 0.01 meV); a minimal sketch of this check follows this list
1. the code shall be written in a way that facilitates easy addition of convergence with respect to other characteristics extracted from simulations (forces, pressures, phonon frequencies, etc.)
1. the code shall support VASP or Quantum ESPRESSO
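
A minimal sketch of the convergence check implied by requirement 1 - the function name and threshold handling are assumptions (0.01 meV = 1e-5 eV):

```python
# Sketch: the k-mesh is considered converged when the total-energy change
# between successive (denser) meshes falls below a threshold.
def is_converged(energies_ev, threshold_ev=1e-5):
    """Return True if the last two total energies (eV) differ by less than threshold_ev."""
    if len(energies_ev) < 2:
        return False
    return abs(energies_ev[-1] - energies_ev[-2]) < threshold_ev
```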

# Expectations

- correctly find the k-point mesh that satisfies the total-energy convergence parameters for a set of 10 materials, ranging from Si2, as the simplest, to a 10-20-atom supercell of your choice
- modular and object-oriented implementation
- commit early and often - at least once per 24 hours

# Timeline

We leave exact timing to the candidate. Must fit within 5 days total.

# User story

As a user of this software I can start it by passing:

- path to input data (e.g. pw.in / POSCAR, INCAR, KPOINTS) and
- kinetic energy cutoff

as parameters and get the k-point dimensions (e.g. 5 5 5).

# Notes

- create an account at exabyte.io and use it for calculation purposes
- suggested modeling engine: Quantum ESPRESSO
121 changes: 121 additions & 0 deletions capolanco/RunConvergence.py
@@ -0,0 +1,121 @@
# This script runs a k-point convergence study until the change in total energy drops below a desired threshold

import subprocess
import ioqeclass as qe
import ioclusterclass as cluster

###########################
##### USER INPUTS
###########################

# Define the path of the input file
# The k-grid of this file will be set as the starting point
# for the convergence process
filein='Si.scf.in'

# Define delta energy threshold for
# convergence (eV)
dEthreshold=1.0 # (eV)


###########################
##### DEVELOPERS INPUTS
###########################

## Convergence
# maximum number of iterations
Nitermax=20
# k-grid increment per iteration
kstep=2

## Cluster
# number of nodes to be used
Nnodes=1
# number of processors per node
ppn=8
# queue
queue='OR'
# walltime
walltimehours=5
walltimeminutes=3


###########################
##### PROGRAM
###########################


### Set up initial parameters
# Initialize pw.x input class
qeinput=qe.qepwinput()
# load the pw.x input
qeinput.load(filein)
# Create the input file for initial run
testin='test.scf.in'
testout='test.scf.out'
qeinput.save(filein,testin)
# Create the job to send to the cluster
# Initialize job class
job=cluster.jobclass()
# set up job class
job.name='test'
job.nodes=Nnodes
job.ppn=ppn
job.queue=queue
job.walltimehours=walltimehours
job.walltimeminutes=walltimeminutes
# create the job file
jobname='job.test.sh'
job.createjobQEpw(jobname,testin,testout)
# Run initial test (the echo call is a dry-run placeholder; uncomment the
# qsub line below to actually submit the job to the cluster)
subprocess.run(['echo',f'running {jobname}'])
##subprocess.run(['qsub',f'{jobname}'])

# Initialize pw.x output class
qeoutput=qe.qepwoutput()
# Read the Total energy from the output
qeoutput.getenergy(testout)


# Loop testing for dE
dE=2.0*dEthreshold
EnergyOld=qeoutput.energy
counter=0
while ((dE>dEthreshold) and (counter<Nitermax)):

# Increase counter
counter=counter+1
print(f'\n## Iteration {counter}')

# Increase k grid
qeinput.kgrid=qeinput.kgrid+kstep

# Create the input file for this iteration
testin=f'test.scf{counter}.in'
testout=f'test.scf{counter}.out'
qeinput.save(filein,testin)

# create the job file
jobname=f'job.test{counter}.sh'
job.createjobQEpw(jobname,testin,testout)
# Run QE calculation
subprocess.run(['echo',f'running {jobname}'])
##subprocess.run(['qsub',f'{jobname}'])

# Read the Total energy from the output
qeoutput.getenergy(testout)

# Update dE and EnergyOld
dE=abs(EnergyOld-qeoutput.energy)
EnergyOld=qeoutput.energy
print(f'dE {dE} eV')

# Display results
if (dE<dEthreshold):
print(f'Convergence has been achieved in {counter} iterations')
print(f'for kgrid {qeinput.kgrid}')
print(f'The total energy change is less than {dEthreshold} eV')
else:
print(f'Convergence has NOT been achieved in {counter} iterations')


30 changes: 30 additions & 0 deletions capolanco/Si.scf.in
@@ -0,0 +1,30 @@
&control
calculation='scf',
outdir = '/global/cscratch1/sd/cpolanco/Si/',
prefix='Si',
pseudo_dir = '/global/homes/c/cpolanco/Materials/pseudo/',
restart_mode='from_scratch',
tprnfor= .true.,
tstress= .true.,
/
&system
ibrav= 2,
celldm(1)= 10.20777693,
nat= 2,
ntyp= 1,
ecutwfc= 100.0
/
&electrons
mixing_beta= 0.7
conv_thr= 1.0d-14
/

ATOMIC_SPECIES
Si 28.086 Si.pz-vbc.UPF

ATOMIC_POSITIONS (alat)
Si 0.00 0.00 0.00
Si 0.25 0.25 0.25

K_POINTS (automatic)
8 8 8 0 0 0