
Reference File Loader Help

Most gene variant databases use RefSeqGene or RefSeq transcript records as reference sequences. The Reference File Loader lets users work with reference files that are not present in GenBank, or add information to an existing GenBank file about alternative transcripts or proteins, or about additional genes contained within or derived from the reference sequence. Users can upload their own reference sequence file in GenBank Flat file format, retrieve the genomic sequence of a gene with its flanking regions, or specify a chromosomal range for use as a reference sequence. Mutalyzer checks whether the file is in valid GenBank Flat file format. If so, Mutalyzer stores the file locally and returns a unique number, the UD identifier, that can be used with all forms of the Mutalyzer Name Checker, Syntax Checker and Name Generator.
Users are encouraged to limit their use of the Reference File Loader by submitting annotation updates and corrections of existing GenBank files following these instructions.

Reference File Loader options:

1) The reference sequence file is a local file

Browse to locate your GenBank Flat file with a .gb extension (e.g. U14680modified.gb) and press the submit button.

2) The reference sequence file can be found at the following URL

Enter the URL where the GenBank Flat file with a .gb extension can be found (e.g. http://MyServer/U14680modified.gb) and press the submit button.
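
Before submitting a file through option 1 or 2, it can help to run a quick local sanity check. The sketch below is not Mutalyzer's own validation, whose rules are not described here. It only tests three structural features of a GenBank Flat file: a leading LOCUS line, an ORIGIN line that introduces the sequence, and a closing "//" terminator. The function name and the sample record are hypothetical.

```python
def looks_like_genbank_flat_file(text: str) -> bool:
    """Rough structural check of a GenBank Flat file.

    Assumption: this is a local pre-check only, not the validation
    that Mutalyzer performs on upload.
    """
    # Ignore blank lines and trailing whitespace.
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    has_locus = lines[0].startswith("LOCUS")                    # record header
    has_origin = any(l.startswith("ORIGIN") for l in lines)     # sequence section
    has_terminator = lines[-1] == "//"                          # end-of-record marker
    return has_locus and has_origin and has_terminator


# Hypothetical minimal record, for illustration only.
sample = """LOCUS       EXAMPLE        12 bp    DNA     linear   UNA 01-JAN-2000
DEFINITION  Hypothetical record for illustration.
ORIGIN
        1 acgtacgtac gt
//
"""

print(looks_like_genbank_flat_file(sample))
```

A file that passes this check can still be rejected by Mutalyzer, for example because of malformed feature annotation.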

3) Retrieve part of the reference genome for a (HGNC) gene symbol

This option retrieves part of the chromosomal reference sequence from the genome build used by Entrez Gene for the gene and organism specified.
The organism name should not contain any spaces (e.g., use homo_sapiens, human or man).
Please note that the genome build used may change without notification! For other builds, use option 4 below.

Input:

Please enter the gene symbol and the organism name (without spaces), and specify the length of the flanking sequences:

Gene symbol: BRCA1
Organism name: man
Number of 5' flanking nucleotides: 5000
Number of 3' flanking nucleotides: 2000
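
The way flank lengths translate into a chromosomal range depends on the strand that encodes the gene. The sketch below illustrates the arithmetic under the usual convention that the 5' flank lies upstream of the gene in the gene's own orientation. The function and class names and the gene coordinates are hypothetical, and this is not a description of Mutalyzer's internal code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChromosomeSlice:
    start: int      # 1-based, inclusive, on the forward strand
    stop: int       # 1-based, inclusive, on the forward strand
    reverse: bool   # True if the gene is encoded by the reverse strand


def gene_slice(gene_start: int, gene_stop: int, on_reverse_strand: bool,
               flank_5: int, flank_3: int) -> ChromosomeSlice:
    """Extend a gene's chromosomal coordinates by its 5' and 3' flanks."""
    if on_reverse_strand:
        # On the reverse strand, the 5' flank sits at the higher coordinates.
        start = gene_start - flank_3
        stop = gene_stop + flank_5
    else:
        start = gene_start - flank_5
        stop = gene_stop + flank_3
    # Chromosomal positions cannot drop below 1.
    return ChromosomeSlice(max(1, start), stop, on_reverse_strand)


# Hypothetical gene at positions 100000..180000 on the reverse strand,
# with the flank lengths from the example input above.
print(gene_slice(100000, 180000, True, flank_5=5000, flank_3=2000))
```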

4) Retrieve a range of a chromosome

Use of NC_accession numbers without version number will result in retrieval of the latest version from Entrez Nucleotide.
Select the reverse orientation if you wish to retrieve genomic sequences comparable to the corresponding RefSeq Gene records for genes encoded by the reverse strand.

Input:

Please enter the accession number of the chromosome or contig and specify the range

Chromosome Accession Number: NC_000011.9
Start Position: 111952571
Stop Position: 111968518
Orientation: Forward
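
To see what such a range covers, the sketch below computes the slice length and builds a request URL for the NCBI E-utilities EFetch service, which accepts seq_start, seq_stop and strand parameters for nucleotide records. This is an illustration of how the same slice could be requested directly from the NCBI. How Mutalyzer itself retrieves the sequence is not described here, and treating the range as inclusive on both ends is an assumption. The helper names are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def slice_length(start: int, stop: int) -> int:
    """Number of positions in an inclusive range (assumed convention)."""
    return stop - start + 1


def efetch_url(accession: str, start: int, stop: int,
               reverse: bool = False) -> str:
    """Build an EFetch URL for a slice of a nucleotide record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "seq_start": start,
        "seq_stop": stop,
        "strand": 2 if reverse else 1,  # EFetch: 1 = plus strand, 2 = minus strand
        "rettype": "gb",                # GenBank Flat file format
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


print(slice_length(111952571, 111968518))  # 15948 positions
print(efetch_url("NC_000011.9", 111952571, 111968518))
```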

Mutalyzer output for all options:

Output:
Your reference sequence was uploaded successfully. You now can use mutalyzer with the following accession number as reference: UD_127955523176
Download this reference sequence.

The unique UD identifiers are generated using time stamps. Mutalyzer caches these records, which speeds up nomenclature checks, including those initiated by the LOVD Mutalyzer module. Mutalyzer also stores metadata about the loading process, which prevents repeated downloads of the same chromosomal slice from the NCBI. The advantage for the user is that the same UD identifier is returned when the same sequence is reloaded. In addition, the metadata allow automatic retrieval of the same sequence if the cache is accidentally emptied.
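
The behaviour described above can be pictured as a lookup table keyed by the loading metadata. The sketch below is a simplified model, not Mutalyzer's implementation: the class name, the choice of key fields, and the way the time stamp is turned into an identifier are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple

# Assumed metadata key: accession, start, stop, orientation.
SliceKey = Tuple[str, int, int, str]


class ReferenceCache:
    """Toy model: one UD identifier per distinct chromosomal slice."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._ids: Dict[SliceKey, str] = {}

    def ud_for(self, accession: str, start: int, stop: int,
               orientation: str) -> str:
        key = (accession, start, stop, orientation)
        if key not in self._ids:
            # Assumed format: "UD_" followed by a time stamp in
            # hundredths of a second. Two loads within the same
            # hundredth would collide in this simplified model.
            self._ids[key] = "UD_%d" % int(self._clock() * 100)
        return self._ids[key]


cache = ReferenceCache()
first = cache.ud_for("NC_000011.9", 111952571, 111968518, "forward")
again = cache.ud_for("NC_000011.9", 111952571, 111968518, "forward")
print(first == again)  # the second request reuses the stored identifier
```

Because the key records which slice was requested, the same information would be enough to fetch the sequence again if the cached record were lost.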

The Reference File Loader uses JavaScript to change the form depending on the selected option. In Internet Explorer, forms may not be displayed correctly. One way to solve this is to add Mutalyzer to your trusted sites.

Last modified on Feb 27, 2013 4:14:36 PM