Doesn't the name pretty much say it all?
Ribosomal Database Project
It offers regular fasta/gb data
and trainset data.
Bacteria unaligned seq (fa, gb)
Archaea unaligned seq (fa, gb)
You can download the Train Set here.
To be honest, though, I'm grabbing RDP for no particular reason. lol
- I'm not really sure how it will be used either :)
$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/fasta/nr.gz
$ wget ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt
$ awk -F '\t' '!/^#/{print $20}' assembly_summary.txt > bacteria.list
The FTP addresses (column 20 of the summary) are then collected in bacteria.list.
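# Read the directory URLs from bacteria.list, the file written by the awk step
for M in $(cat bacteria.list)
do
    # Quote the pattern so that wget, not the local shell, expands the FTP wildcard
    wget -P rna "$M/*rna_from_genomic.fna.gz"
done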
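The loop above relies on wget expanding a wildcard on the FTP server. As an alternative, here is a minimal sketch that builds the file URL explicitly. It assumes that each file is named after its assembly directory (`<directory name>_rna_from_genomic.fna.gz`) and that missing paths appear as `na` in the list. The function names `rna_url` and `fetch_rna_list` are made up for this example.

```shell
#!/usr/bin/env bash

# Build the explicit URL of the rna_from_genomic file for one assembly directory.
# Assumption: the file is named <directory name>_rna_from_genomic.fna.gz.
rna_url() {
    local dir="${1%/}"        # drop a trailing slash, if any
    local name="${dir##*/}"   # last path component, i.e. the assembly directory name
    printf '%s/%s_rna_from_genomic.fna.gz\n' "$dir" "$name"
}

# Download the RNA file of every assembly listed in a file
# (one directory URL per line) into an output directory (default: rna).
fetch_rna_list() {
    local list="$1" outdir="${2:-rna}" dir
    while IFS= read -r dir; do
        # Skip empty lines and "na" placeholders (assumed marker for a missing path).
        if [ -z "$dir" ] || [ "$dir" = "na" ]; then
            continue
        fi
        # -nc skips files that were already downloaded, so the loop can be re-run.
        wget -nc -P "$outdir" "$(rna_url "$dir")"
    done < "$list"
}
```

With both functions defined, `fetch_rna_list bacteria.list rna` should do roughly the same job as the loop, without depending on FTP wildcard support.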
Entrez Database | UID common name | E-utility Database Name |
---|---|---|
BioProject | BioProject ID | bioproject |
BioSample | BioSample ID | biosample |
Biosystems | BSID | biosystems |
Books | Book ID | books |
Conserved Domains | PSSM-ID | cdd |
dbGaP | dbGaP ID | gap |
dbVar | dbVar ID | dbvar |
Epigenomics | Epigenomics ID | epigenomics |
EST | GI number | nucest |
Gene | Gene ID | gene |
Genome | Genome ID | genome |
GEO Datasets | GDS ID | gds |
GEO Profiles | GEO ID | geoprofiles |
GSS | GI number | nucgss |
HomoloGene | HomoloGene ID | homologene |
MeSH | MeSH ID | mesh |
NCBI C++ Toolkit | Toolkit ID | toolkit |
NCBI Web Site | Web Site ID | ncbisearch |
NLM Catalog | NLM Catalog ID | nlmcatalog |
Nucleotide | GI number | nuccore |
OMIA | OMIA ID | omia |
PopSet | PopSet ID | popset |
Probe | Probe ID | probe |
Protein | GI number | protein |
Protein Clusters | Protein Cluster ID | proteinclusters |
PubChem BioAssay | AID | pcassay |
PubChem Compound | CID | pccompound |
PubChem Substance | SID | pcsubstance |
PubMed | PMID | pubmed |
PubMed Central | PMCID | pmc |
SNP | rs number | snp |
SRA | SRA ID | sra |
Structure | MMDB-ID | structure |
Taxonomy | TaxID | taxonomy |
UniGene | UniGene Cluster ID | unigene |
UniSTS | STS ID | unists |
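
The names in the third column are the values passed as the `db` parameter in E-utility requests, and the UIDs in the second column go in the `id` parameter. As a rough sketch, the helpers below only build ESearch and EFetch URLs. The function names `esearch_url` and `efetch_url` are made up for this example, and the query term is only minimally escaped (spaces become `+`), so treat it as an illustration rather than a complete client.

```shell
#!/usr/bin/env bash

# Base URL of the NCBI E-utilities.
EUTILS_BASE="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Print an ESearch URL for an E-utility database name and a query term.
# Simplification: only spaces are escaped (as "+"); other characters pass through unchanged.
esearch_url() {
    local db="$1" term="$2"
    term="$(printf '%s' "$term" | tr ' ' '+')"
    printf '%s/esearch.fcgi?db=%s&term=%s\n' "$EUTILS_BASE" "$db" "$term"
}

# Print an EFetch URL for an E-utility database name and a UID.
efetch_url() {
    local db="$1" uid="$2"
    printf '%s/efetch.fcgi?db=%s&id=%s\n' "$EUTILS_BASE" "$db" "$uid"
}
```

For example, `curl "$(esearch_url taxonomy 'Escherichia coli')"` would query the Taxonomy database using the name `taxonomy` from the table.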