SimpleSearch help

 

The GABI-Kat "SimpleSearch" database contains information about the FSTs that have been produced in the context of GABI. The sequences included have a significant similarity with sequences from the Arabidopsis thaliana genome. The sequences have been quality-trimmed, and the T-DNA part of the sequence has been removed.

Three kinds of search are possible:
1a) text-based search for gene hits (up to 20 AGI codes at a time, with SPACE as separator),
1b) text-based search for line ID or GenBank accession number,
2) pseudochromosome-position-based search for a genome area (range of positions),
3) sequence-based search using BLAST.


The "gene hit search" addresses FSTs which qualify as gene hits (see GABI-Kat FAQ for our definition of a "gene hit"). Genomic DNA is considered, so you will also find hits in the introns of these genes. You can either enter the AGI gene code, or a keyword to search the gene annotation text (you need to choose one of them). If you search by gene code, you can enter up to 20 A. thaliana gene codes at the same time. Separate the codes by SPACE. The keyword search allows you to do a substring search of the gene annotation text (see here for details on the data source).
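The input rules for the gene code search (at most 20 codes, separated by SPACE) can be checked before submitting a query. The following is a minimal sketch, not the check SimpleSearch itself performs: the function name `prepare_gene_codes` is hypothetical, and the regular expression assumes the usual shape of an AGI code ("At", chromosome 1-5, C or M, "g", five digits).

```python
import re

# Assumed shape of an AGI gene code, e.g. At3g18040. This pattern is an
# illustration only; the server-side validation is not documented here.
AGI_PATTERN = re.compile(r"^At[1-5CM]g\d{5}$", re.IGNORECASE)
MAX_CODES = 20  # limit stated in the help text


def prepare_gene_codes(raw_input):
    """Split a SPACE-separated string into AGI codes and validate them."""
    codes = raw_input.split()
    if not codes:
        raise ValueError("no gene code given")
    if len(codes) > MAX_CODES:
        raise ValueError(f"{len(codes)} codes given, at most {MAX_CODES} allowed")
    invalid = [code for code in codes if not AGI_PATTERN.match(code)]
    if invalid:
        raise ValueError("not an AGI code: " + ", ".join(invalid))
    return codes


if __name__ == "__main__":
    print(prepare_gene_codes("At3g18040 At1g74890"))
```

Rejecting bad input locally avoids an empty result page caused by a typo in one of the codes.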

Searching for a line ID or the GenBank accession number of an FST leads directly to the FST page. At the top of the FST page, line-specific information such as availability and segregation analysis is displayed, followed by information for one or more FST hits found for the given line. For each hit, the FST sequence, the respective sequenced BAC (more exactly: annotation unit), and information about the predicted insertion position are displayed. A link to confirmation sequences is given for lines that have been confirmed for an insertion and are available from NASC.

The "Search for genome range" will list all GABI-Kat hits in the defined genome region. The input values should be pseudochromosome positions from TAIRv10. 

With the sequence-based search, all FST sequences included in the database can be found. Both nucleotide and protein sequences are accepted. You can limit the sequence divergence between the input sequence and the FST by decreasing the expect value.

The gene hit search result page shows the corresponding GABI-Kat line ID with a link to the FST sequence, the AGI gene code, and a link to the graphic view of the gene and FST.

The graphic view page displays a genome fragment around the gene or FST as an image, with all the genes and FSTs in this region. The FSTs and genes are clickable, and the image can be zoomed out once. The FSTs are distinguished by three different colors:

Confirmed: The insertion is confirmed and has been donated to NASC.

 

Unknown: Work on this insertion has not been done yet; the insertion can be ordered.

 

Failed: The insertion could not be confirmed.

 

  

Possible error messages from the web order form

 

Error messages like this:
- Firstname is empty or has invalid characters.
- E-mail address is empty or invalid.
- Valid input for institution is required.
... indicate that you need to add input or to correct the term you put into the respective field.  

An error message like this:
- VAT number is required for the selected country.
... indicates that you selected an EU country but did not provide the VAT ID of your institute or institution. 

An error message like this:
- VAT number is invalid, a valid VAT number looks like FR1234567890.
... indicates that you provided an invalid VAT ID. 

Other, hopefully self-explanatory error messages are:
- No gene or BAC code for line ID <lineid> entered.
- No line ID for gene or BAC code <gene/bac-code> entered.
- Line ID <lineid> has multiple genecodes. Please use one of them instead of the BAC-ID.
- Line ID <lineid> has been recently donated to NASC, but we have not received a NASC ID yet. Please order from NASC when the line becomes available from them.
- Line ID <lineid> is available from NASC (<nascid>), please order it from NASC.
- Line ID and gene or BAC code <lineid - gene/bac-code> don't match (the line has no insertion at that locus).
- Line ID <lineid> died - no seeds available.
- Confirmation of the insertion in <lineid - gene/bac-code> failed; it cannot be ordered.

 

 

Overview of the structure of the SimpleSearch website (dynamic part)

 

The following picture gives an overview of how the different pages of SimpleSearch can be accessed.

 

 

 

How to link SimpleSearch pages from your website

 

If you want to link to information from SimpleSearch on your website, you can use the following links to access the data:

- Lines:
http://www.gabi-kat.de/db/showseq.php?line=[lineid]
where [lineid] is the line that you are interested in.

Example: http://www.gabi-kat.de/db/showseq.php?line=122F10


- Confirmation-Sequences:
http://www.gabi-kat.de/db/getseq.php?filename=[sequenceid]
where [sequenceid] is the sequence name from the report sheet.

Example: http://www.gabi-kat.de/db/getseq.php?filename=19-K018911-022-250-D07C-DL91

 


- Segregation Results:
http://www.gabi-kat.de/db/show_segre.php?plantid=[lineid]
where [lineid] is the line that you are interested in.

Example: http://www.gabi-kat.de/db/show_segre.php?plantid=250D07

- FSTs:
http://www.gabi-kat.de/db/showseq.php?line=[lineid]&gene=[genecode]
where [lineid] is the line that you are interested in and [genecode] specifies the insertion.

Example: http://www.gabi-kat.de/db/showseq.php?line=250D07&gene=At3g18040


- Graphic View:
http://www.gabi-kat.de/db/picture.php?genecode=[genecode]
where [genecode] is the gene code you are interested in.

Example: http://www.gabi-kat.de/db/picture.php?genecode=At1g74890
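If you generate such links for many lines or genes, a few helper functions keep the URL patterns in one place. This is a minimal sketch based only on the patterns listed above; the function names are hypothetical, and nothing here checks whether a given line ID or gene code actually exists in the database.

```python
from urllib.parse import urlencode

# Base address taken from the link patterns in this help text.
BASE_URL = "http://www.gabi-kat.de/db/"


def _build(script, **params):
    """Join a script name and its query parameters into a full URL."""
    return BASE_URL + script + "?" + urlencode(params)


def line_url(line_id):
    return _build("showseq.php", line=line_id)


def confirmation_sequence_url(sequence_id):
    return _build("getseq.php", filename=sequence_id)


def segregation_url(line_id):
    return _build("show_segre.php", plantid=line_id)


def fst_url(line_id, gene_code):
    return _build("showseq.php", line=line_id, gene=gene_code)


def graphic_view_url(gene_code):
    return _build("picture.php", genecode=gene_code)


if __name__ == "__main__":
    print(fst_url("250D07", "At3g18040"))
```

Using `urlencode` rather than plain string concatenation ensures that any unexpected character in an identifier is escaped correctly.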

 

 

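What is FASTA format

A sequence in FASTA format begins with a single description line that starts with ">", followed by one or more lines of sequence data. Example:
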
>name of the query sequence
TTCTAGGGGTTCTCTCAAATCTGCTCTTCAACCATGGCGGACGAATCTCAATACTCATCGGATACTTACTCCAACAAACG
CAAATACGAAGAACCAACCGCTCCTCCTCCATCAACTCGCAGACCTACCGGCTTCTCTTCTGGTCCGATCCCATCTGCTT
CAGTTGATCCCACCGCACCTACCGGTCTTCCACCTTCTTCTTACAACAGCGTTCCTCCTCCGATGGATGAAATCCAGATT
GCTAAACAAAAAGCACAAGAAATCGCTGCTCGTCTTCTTAATAGCGCTGATGCTAAACGTCCTCGTGTTGACAATGGTGC
TTCTTATGATTATGGTGACAACAAAGGATTTAGCTCATATCCCTCTGGTTCGTTCTTTAAAATCTCTTTTAACTTCTTTT
GTTTATGGAATTTACGGTTTGGAATTGAAAACTTACTGATTGTGATTTGATCTTGATTTAGAGGGTAAGCAGATGTC


The nucleic acid codes supported are:

        
        A --> adenosine           M --> A C (amino)
        C --> cytidine            S --> G C (strong)
        G --> guanine             W --> A T (weak)
        T --> thymidine           B --> G T C
        U --> uridine             D --> G A T
        R --> G A (purine)        H --> A C T
        Y --> T C (pyrimidine)    V --> G C A
        K --> G T (keto)          N --> A G C T (any)
                                  -  gap of indeterminate length


The accepted amino acid codes are:

 
    A  alanine                         P  proline
    B  aspartate or asparagine         Q  glutamine
    C  cysteine                        R  arginine
    D  aspartate                       S  serine
    E  glutamate                       T  threonine
    F  phenylalanine                   U  selenocysteine
    G  glycine                         V  valine
    H  histidine                       W  tryptophan
    I  isoleucine                      Y  tyrosine
    K  lysine                          Z  glutamate or glutamine
    L  leucine                         X  any
    M  methionine                      *  translation stop
    N  asparagine                      -  gap of indeterminate length
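
Before pasting a sequence into the BLAST search form, it can be useful to check that it is well-formed FASTA and contains only the codes listed above. The following is a minimal sketch under that assumption; the names `parse_fasta` and `unsupported_codes` are hypothetical, and the two alphabets are simply transcribed from the tables in this section.

```python
# Alphabets transcribed from the two code tables above.
NUCLEOTIDE_CODES = set("ACGTURYKMSWBDHVN-")
AMINO_ACID_CODES = set("ABCDEFGHIKLMNPQRSTUVWYZX*-")


def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            raise ValueError("sequence data before the first '>' line")
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def unsupported_codes(sequence, alphabet):
    """Return the sorted characters of sequence that are not in alphabet."""
    return sorted(set(sequence) - alphabet)


if __name__ == "__main__":
    example = ">name of the query sequence\nTTCTAGGGGTTCTC\nTCAAATCTGC\n"
    for name, seq in parse_fasta(example):
        print(name, len(seq), unsupported_codes(seq, NUCLEOTIDE_CODES))
```

An empty list from `unsupported_codes` means every character of the sequence appears in the chosen table.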