Commit 55ea7d4

Author: Daniel Schreyer
Merge pull request #471 from nf-core/revert-470-circdna
Revert "Circdna"
2 parents dece64a + 3cd7a46 commit 55ea7d4

13 files changed

+246
-202705
lines changed

README.md

Lines changed: 21 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -1,71 +1,36 @@
1-
# test-datasets: `circdna`
1+
# ![nfcore/test-datasets](docs/images/test-datasets_logo.png)
2+
Test data to be used for automated testing with the nf-core pipelines
23

3-
This branch contains test data to be used for automated testing with the [nf-core/circdna](https://github.com/nf-core/circdna) pipeline.
4+
## Introduction
45

5-
## Content of this repository
6+
nf-core is a collection of high quality Nextflow pipelines. This repository contains various files for CI and unit testing of nf-core pipelines and infrastructure.
67

7-
`reference/`: Genome reference files (iGenomes R64-1-1 Ensembl release)
8+
The principle for nf-core test data is as small as possible, as large as necessary. Always ask for guidance on the [nf-core slack](https://nf-co.re/join) before adding new test data.
89

9-
`testdata/` : 200,000 FastQ paired-end reads
10+
## Documentation
1011

11-
## Minimal test dataset origin
12-
The data set was generated using Circle-Map Simulate (see [Circle-Map](https://github.com/iprada/Circle-Map)) and InSilicoSeq (see [InSilicoSeq](https://github.com/HadrienG/InSilicoSeq)). Circle-Map simulated 120,000 paired-end reads originating from circle-seq data and InSilicoSeq simulated 80,000 random reads from the reference genome.
12+
nf-core/test-datasets comes with documentation in the `docs/` directory:
1313

14-
### Data Generation
14+
01. [Add a new test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/ADD_NEW_DATA.md)
15+
02. [Use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
1516

16-
The example below was used to generate the raw paired-end FastQ files.
17+
## Downloading test data
1718

18-
``` bash
19-
Circle-Map Simulate -c 200 -g genome.fa -N 120000 -r 150 -b cm_1 -p 10
20-
Circle-Map Simulate -c 200 -g genome.fa -N 120000 -r 150 -b cm_2 -p 10
21-
Circle-Map Simulate -c 200 -g genome.fa -N 120000 -r 150 -b cm_3 -p 10
22-
wgsim -1 150 -2 150 -N 80000 genome.fa wgsim_1_R1.fastq wgsim_1_R2.fastq -S 1
23-
wgsim -1 150 -2 150 -N 80000 genome.fa wgsim_2_R1.fastq wgsim_2_R2.fastq -S 1
24-
wgsim -1 150 -2 150 -N 80000 genome.fa wgsim_3_R1.fastq wgsim_3_R2.fastq -S 1
25-
cat cm_1_2.fastq wgsim_1_R2.fastq | gzip --no-name > ../testdata/circdna_1_R2.fastq.gz
26-
cat cm_2_2.fastq wgsim_2_R2.fastq | gzip --no-name > ../testdata/circdna_2_R2.fastq.gz
27-
cat cm_3_2.fastq wgsim_3_R2.fastq | gzip --no-name > ../testdata/circdna_3_R2.fastq.gz
19+
Due to the large number of large files in this repository for each pipeline, we highly recommend cloning only the branches you would use.
2820

29-
cat cm_1_1.fastq wgsim_1_R1.fastq | gzip --no-name > ../testdata/circdna_1_R1.fastq.gz
30-
cat cm_2_1.fastq wgsim_2_R1.fastq | gzip --no-name > ../testdata/circdna_2_R1.fastq.gz
31-
cat cm_3_1.fastq wgsim_3_R1.fastq | gzip --no-name > ../testdata/circdna_3_R1.fastq.gz
21+
```bash
22+
git clone <url> --single-branch --branch <pipeline/modules/branch_name>
3223
```
3324

34-
### Expected output
25+
To subsequently fetch other branches[^1]:
3526

36-
To track and test the reproducibility of the pipeline with default parameters, below are some of the expected outputs.
37-
38-
### Number of `Circle-Map Realign` circles
39-
40-
| sample | circles |
41-
|-----------------------|-------|
42-
| circdna_1 | 275 |
43-
| circdna_2 | 280 |
44-
| circdna_3 | 279 |
45-
46-
### Number of `Circexplorer2` circles
47-
48-
| sample | circles |
49-
|-----------------------|-------|
50-
| circdna_1 | 392 |
51-
| circdna_2 | 328 |
52-
| circdna_3 | 393 |
53-
54-
### Number of `circle_finder` circles
55-
56-
| sample | circles |
57-
|-----------------------|-------|
58-
| circdna_1 | 267 |
59-
| circdna_2 | 275 |
60-
| circdna_3 | 266 |
61-
62-
### Number of `unicycler` circles
27+
```bash
28+
git remote set-branches --add origin [remote-branch]
29+
git fetch
30+
```
6331

64-
| sample | circles |
65-
|-----------------------|-------|
66-
| circdna_1 | 1 |
67-
| circdna_2 | 0 |
68-
| circdna_3 | 0 |
32+
## Support
6933

70-
These are just guidelines and will change with the use of different software, and with any restructuring of the pipeline away from the current defaults.
34+
For further information or help, don't hesitate to get in touch on our [Slack organisation](https://nf-co.re/join/slack) (a tool for instant messaging).
7135

36+
[^1]: From [stackoverflow](https://stackoverflow.com/a/60846265/11502856)
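
The two README snippets above (single-branch clone, then adding another branch) can be tried end to end without network access. The following is a minimal sketch, assuming a throwaway local repository stands in for the real remote; the branch names `pipeline_a` and `pipeline_b` and the file names are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so nothing outside it is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Build a small stand-in "remote" with two branches (hypothetical names).
git init --quiet upstream
cd upstream
git config user.email "demo@example.com"
git config user.name "Demo"
git checkout --quiet -b pipeline_a
echo "data for pipeline_a" > a.txt
git add a.txt
git commit --quiet -m "Add pipeline_a data"
git checkout --quiet -b pipeline_b
echo "data for pipeline_b" > b.txt
git add b.txt
git commit --quiet -m "Add pipeline_b data"
cd ..

# Clone only one branch, as the README recommends.
git clone --quiet --single-branch --branch pipeline_a "file://$workdir/upstream" clone
cd clone

# List the remote-tracking branches known after the single-branch clone.
git branch -r

# Later: register a second branch with the remote and fetch it.
git remote set-branches --add origin pipeline_b
git fetch --quiet

# List the remote-tracking branches again after the fetch.
git branch -r
```

With a real checkout, the `file://` URL would be replaced by the repository URL and the branch names by the pipeline branches you actually need.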

docs/ADD_NEW_DATA.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
# How to add and use a new test dataset
2+
3+
Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when adding a new test dataset.
4+
5+
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) that there isn't already a branch containing data that could be used
6+
- If this is the case, follow the [documentation on how to use an existing test dataset](https://github.com/nf-core/test-datasets/blob/master/docs/USE_EXISTING_DATA.md)
7+
- [ ] Fork the [nf-core/test-datasets repository](https://github.com/nf-core/test-datasets) to your GitHub account
8+
- [ ] Create a new branch on your fork
9+
- [ ] Add your test dataset
10+
- [ ] If you clone it locally use `git clone <url> --branch <branch> --single-branch`
11+
- [ ] Make a PR on a new branch with a relevant name
12+
- [ ] Wait for the PR to be merged
13+
- [ ] Use this newly created branch for your tests
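
The git side of this checklist can be sketched locally. This is only an illustration under stated assumptions: a local bare repository stands in for your GitHub fork, and the branch name `my_pipeline` and the file `testdata/sample_R1.fastq` are hypothetical. Opening the PR itself happens on GitHub and is not shown.

```bash
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for your fork: a local bare repository (a real fork lives on GitHub).
git init --quiet --bare fork.git

# Seed the stand-in fork with one commit so there is a branch to clone.
git init --quiet seed
cd seed
git config user.email "demo@example.com"
git config user.name "Demo"
git checkout --quiet -b master
echo "# test-datasets" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git push --quiet "file://$workdir/fork.git" master
cd ..

# Checklist step: clone a single branch only.
git clone --quiet "file://$workdir/fork.git" --branch master --single-branch work
cd work
git config user.email "demo@example.com"
git config user.name "Demo"

# Checklist step: create a new branch with a relevant (here hypothetical) name.
git checkout --quiet -b my_pipeline

# Checklist step: add a tiny test dataset (one made-up FastQ record).
mkdir -p testdata
printf '@read1\nACGT\n+\nIIII\n' > testdata/sample_R1.fastq
git add testdata
git commit --quiet -m "Add minimal test data for my_pipeline"

# Push the branch to the fork; the PR would then be opened from it.
git push --quiet origin my_pipeline
```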

docs/USE_EXISTING_DATA.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# How to use an existing test dataset
2+
3+
Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested when using an existing test dataset.
4+
5+
- [ ] Check [here](https://github.com/nf-core/test-datasets/branches/all) to find the branch corresponding to the test dataset you want to use
6+
- [ ] Specify in the `conf/test.config` the path to the files from the test dataset
7+
- [ ] Set up your CI tests following the nf-core best practices (cf [.github/workflows/ci.yml template](https://github.com/nf-core/tools/blob/dev/nf_core/pipeline-template/.github/workflows/ci.yml))
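
One way to obtain a path for `conf/test.config` is to point at the raw file on the dataset branch. The sketch below assumes that approach and uses GitHub's raw-content URL layout (`raw.githubusercontent.com/<owner>/<repo>/<branch>/<path>`); the helper `raw_url`, the branch `my_pipeline`, and the file path are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build the raw-download URL for a file on a given
# branch of nf-core/test-datasets.
raw_url() {
  local branch="$1" path="$2"
  printf 'https://raw.githubusercontent.com/nf-core/test-datasets/%s/%s\n' "$branch" "$path"
}

# Example call with made-up branch and file names.
raw_url "my_pipeline" "testdata/sample_R1.fastq.gz"
```

The printed URL could then be pasted into the relevant parameter in `conf/test.config`; which parameter that is depends on the pipeline.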

docs/images/test-datasets_logo.png

19 KB

docs/images/test-datasets_logo.svg

Lines changed: 205 additions & 0 deletions
