Validation Report

eva-sub-cli vcligeneratedversion
You requested the shallow validation. Please run the full validation before submitting the data.
VCF File | Variant lines validated in VCF | Entries used in Fasta
input_fail.vcf | 10000 | 24
input_passed.vcf | 10000 | 24

Project Summary

General details about the project

Project Title: My cool project

Validation Date: 2023-08-31 12:34:56

Submission Directory: /test/submission/dir

Files mapping
VCF File | Fasta File | Analysis
input_fail.vcf | input_fail.fa | A
input_pass.vcf | input_pass.fa | B
input_test.vcf | input_test.fa | could not be linked

Metadata validation results

Ensures that required fields are present and values are formatted correctly. For requirements, please refer to the EVA website.
❌ Metadata validation check
Full report: /path/to/metadata/metadata_spreadsheet_validation.txt
Sheet | Row | Column | Description
Files | | | Sheet "Files" is missing
Project | 2 | Project Title | Column "Project Title" is not populated
Project | 2 | Description | Column "Description" is not populated
Project | 2 | Tax ID | Column "Tax ID" is not populated
Project | 2 | Center | Column "Center" is not populated
Analysis | 2 | Analysis Title | Column "Analysis Title" is not populated
Analysis | 2 | Description | Column "Description" is not populated
Analysis | 2 | Experiment Type | Column "Experiment Type" is not populated
Analysis | 2 | Reference | Column "Reference" is not populated
Sample | 3 | Sample Accession | Column "Sample Accession" is not populated
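
The kind of check reported above (a missing sheet, or a required column left empty) can be sketched as follows. This is a minimal illustration, not the tool's implementation: the `REQUIRED` mapping, the `File Name` column, the function name and the assumption that data starts on row 2 are all hypothetical. The actual requirements are those published on the EVA website.

```python
# Hypothetical required columns per sheet (illustrative only; the real
# requirements are defined by EVA, not by this sketch).
REQUIRED = {
    "Project": ["Project Title", "Description", "Tax ID", "Center"],
    "Files": ["File Name"],
}


def check_metadata(sheets, required=REQUIRED, first_data_row=2):
    """Return (sheet, row, column, description) tuples, one per problem.

    `sheets` maps a sheet name to a list of rows, each row being a dict of
    column name -> value. Assumes row 1 holds the header, so data starts at
    `first_data_row`.
    """
    errors = []
    for sheet, columns in required.items():
        if sheet not in sheets:
            errors.append((sheet, "", "", f'Sheet "{sheet}" is missing'))
            continue
        for offset, row in enumerate(sheets[sheet]):
            for column in columns:
                value = row.get(column)
                # Treat absent, None and whitespace-only values as unpopulated.
                if value is None or str(value).strip() == "":
                    errors.append((sheet, first_data_row + offset, column,
                                   f'Column "{column}" is not populated'))
    return errors


example_errors = check_metadata(
    {"Project": [{"Project Title": "My cool project", "Tax ID": 9606}]}
)
for error in example_errors:
    print(error)
```

Each returned tuple lines up with one row of the Sheet, Row, Column, Description table above.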

VCF validation results

Checks whether each file is compliant with the VCF specification. Also checks whether the variants' reference alleles match against the reference assembly.

input_fail.vcf

❌ Assembly check: 26/36 (72.22%)
First 10 errors per category are below. Full report: /path/to/assembly_failed/report
Category | Error
Parsing Error | The assembly checking could not be completed: Contig 'chr23' not found in assembly report
mismatch error | Chromosome 1, position 35549, reference allele 'G' does not match the reference sequence, expected 'c'
mismatch error | Chromosome 1, position 35595, reference allele 'G' does not match the reference sequence, expected 'a'
mismatch error | Chromosome 1, position 35618, reference allele 'G' does not match the reference sequence, expected 'c'
mismatch error | Chromosome 1, position 35626, reference allele 'A' does not match the reference sequence, expected 'g'
mismatch error | Chromosome 1, position 35639, reference allele 'T' does not match the reference sequence, expected 'c'
mismatch error | Chromosome 1, position 35643, reference allele 'T' does not match the reference sequence, expected 'g'
mismatch error | Chromosome 1, position 35717, reference allele 'T' does not match the reference sequence, expected 'g'
mismatch error | Chromosome 1, position 35819, reference allele 'T' does not match the reference sequence, expected 'a'
mismatch error | Chromosome 1, position 35822, reference allele 'T' does not match the reference sequence, expected 'c'
❌ VCF check: 1 critical errors, 1 non-critical errors
First 10 errors per category are below. Full report: /path/to/vcf_failed/report
Category | Error
critical error | Line 4: Error in meta-data section.
non-critical error | Sample #11, field AD does not match the meta specification Number=R (expected 2 value(s)). AD=..

input_passed.vcf

✔ Assembly check: 247/247 (100.0%)
✔ VCF check: 0 critical errors, 0 non-critical errors
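
The assembly check described in this section compares each variant's reference allele with the reference sequence at the same position. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the reference is held in memory as a plain dict, positions are 1-based, and the comparison ignores case (an assumption prompted by the lowercase expected bases in the report, not confirmed behaviour of the tool).

```python
def check_ref_alleles(variants, reference):
    """Compare reference alleles with a reference sequence.

    variants:  iterable of (chrom, pos, ref) tuples, pos being 1-based.
    reference: dict mapping contig name -> sequence string.
    Returns (matched, total, errors).
    """
    matched = 0
    total = 0
    errors = []
    for chrom, pos, ref in variants:
        total += 1
        if chrom not in reference:
            errors.append(f"Contig '{chrom}' not found in reference")
            continue
        # Slice out as many bases as the reference allele is long.
        expected = reference[chrom][pos - 1:pos - 1 + len(ref)]
        # Assumption: soft-masked (lowercase) bases count as a match.
        if expected.upper() == ref.upper():
            matched += 1
        else:
            errors.append(
                f"Chromosome {chrom}, position {pos}, reference allele '{ref}' "
                f"does not match the reference sequence, expected '{expected}'"
            )
    return matched, total, errors


def summary(matched, total):
    """Format the match count the way the report lines above show it."""
    return f"{matched}/{total} ({round(100 * matched / total, 2)}%)"


print(summary(26, 36))    # 26/36 (72.22%)
print(summary(247, 247))  # 247/247 (100.0%)
```

The two `summary` calls reproduce the figures reported for input_fail.vcf and input_passed.vcf.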

Sample name concordance check

Checks whether information in the metadata is concordant with that contained in the VCF files, in particular sample names.
Analysis A: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category
Samples described in the metadata but not in the VCF files | SampleA1, SampleA2 , SampleA3, SampleA4, SampleA5
Samples in the VCF files but not described in the metadata | A1Sample , A2Sample, A3Sample, A4Sample, A5Sample
All Errors For Category - Samples described in the metadata but not in the VCF files:
  1. •SampleA1
  2. SampleA2•
  3. SampleA3
  4. SampleA4
  5. SampleA5
  6. SampleA6
  7. SampleA7
  8. SampleA8
  9. SampleA9
  10. SampleA10
All Errors For Category - Samples in the VCF files but not described in the metadata:
  1. A1Sample•
  2. •A2Sample
  3. A3Sample
  4. A4Sample
  5. A5Sample
  6. A6Sample
  7. A7Sample
  8. A8Sample
  9. A9Sample
  10. A10Sample
Analysis B: Sample names in metadata match with those in VCF files
Analysis C: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category
Samples described in the metadata but not in the VCF files | SampleC1 , SampleC2, SampleC3, SampleC4
Samples in the VCF files but not described in the metadata | C1Sample , C2Sample, C3Sample, C4Sample
All Errors For Category - Samples described in the metadata but not in the VCF files:
  1. SampleC1•
  2. •SampleC2
  3. SampleC3
  4. SampleC4
All Errors For Category - Samples in the VCF files but not described in the metadata:
  1. C1Sample•
  2. •C2Sample
  3. C3Sample
  4. C4Sample
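
The two error categories in this section correspond to the two one-sided differences between the sample names in the metadata and those in the VCF files. A minimal sketch of that comparison follows. The function name and the returned keys are hypothetical, and exact string matching is an assumption, under which a name with leading or trailing whitespace counts as a different sample.

```python
def compare_samples(metadata_samples, vcf_samples):
    """Return the sample names found on one side only.

    Names are compared as exact strings, so "SampleC1 " (trailing space)
    and "SampleC1" are treated as different samples.
    """
    metadata_set = set(metadata_samples)
    vcf_set = set(vcf_samples)
    return {
        "in_metadata_not_in_vcf": sorted(metadata_set - vcf_set),
        "in_vcf_not_in_metadata": sorted(vcf_set - metadata_set),
    }


result = compare_samples(["SampleC1 ", "SampleC2"], ["SampleC1", "SampleC2"])
print(result)
```

Both lists being empty corresponds to the "match" outcome shown for Analysis B.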

Reference genome INSDC check

Checks that the reference sequences in the FASTA file used to call the variants are accessioned in INSDC. Also checks if the reference assembly accession in the metadata matches the one determined from the FASTA file.

metadata_asm_match.fa

✔ All sequences are INSDC accessioned
✔ Analysis A: Assembly accession in metadata is compatible

metadata_asm_not_found.fa

✔ All sequences are INSDC accessioned
❌ No assembly accession found in metadata
Full report: /path/to/metadata_asm_not_found.yml
Category | Accessions
Assembly accession found in metadata | Not found
Assembly accession(s) compatible with FASTA | GCA_1

metadata_asm_not_match.fa

✔ All sequences are INSDC accessioned
❌ Analysis B: Assembly accession in metadata is not compatible
Full report: /path/to/metadata_asm_not_match.yml
Category | Accessions
Assembly accession found in metadata | GCA_2
Assembly accession(s) compatible with FASTA | GCA_1

metadata_error.fa

Warning: The following results may be incomplete due to problems with external services. Please try again later for complete results.
Error message: 500 Server Error: Internal Server Error for url: https://www.ebi.ac.uk/eva/webservices/contig-alias/v1/chromosomes/md5checksum/hjfdoijsfc47hfg0gh9qwjrve
✔ All sequences are INSDC accessioned
✔ Analysis C: Assembly accession in metadata is compatible

not_all_insdc.fa

❌ Some sequences are not INSDC accessioned
First 10 sequences not in INSDC. Full report: /path/to/not_all_insdc_check.yml
Sequence name | Refget md5
2 | hjfdoijsfc47hfg0gh9qwjrve
✔ Analysis A: Assembly accession in metadata is compatible
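
The "Refget md5" column identifies a sequence by a checksum of its content rather than by its name. The sketch below shows one way such a checksum could be computed. It assumes the digest is the MD5 of the uppercased sequence with whitespace removed, which is how the refget MD5 identifier is commonly described. The function name is hypothetical, and this is not necessarily how the tool computes it.

```python
import hashlib


def sequence_md5(sequence):
    """Return the MD5 hex digest of a sequence.

    Assumption: whitespace (including FASTA line breaks) is removed and the
    sequence is uppercased before hashing.
    """
    normalised = "".join(sequence.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()


# The same bases give the same checksum regardless of case or line wrapping.
print(sequence_md5("acgt") == sequence_md5("AC\nGT"))
```

Because the checksum depends only on the bases, a sequence can be looked up in an external service even when its name in the FASTA file is not an INSDC accession.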