fly005304
2008-07-03, 17:10
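I need to write a program that reads the "AC" line from the database, and also the "CC" line that contains " -!- SUBCELLULAR LOCATION", and then saves both the "AC" line and the "CC" line with -!- SUBCELLULAR LOCATION into a single file!
The database looks like this: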
ID 104K_THEAN Reviewed; 893 AA.
AC Q4U9M9;
DT 18-APR-2006, integrated into UniProtKB/Swiss-Prot.
DT 05-JUL-2005, sequence version 1.
DT 23-OCT-2007, entry version 14.
DE 104 kDa microneme/rhoptry antigen precursor (p104).
GN ORFNames=TA08425;
OS Theileria annulata.
OC Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Piroplasmida;
OC Theileriidae; Theileria.
OX NCBI_TaxID=5874;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Ankara;
RX PubMed=15994557; DOI=10.1126/science.1110418;
RA Pain A., Renauld H., Berriman M., Murphy L., Yeats C.A., Weir W.,
RA Kerhornou A., Aslett M., Bishop R., Bouchier C., Cochet M.,
RA Coulson R.M.R., Cronin A., de Villiers E.P., Fraser A., Fosker N.,
RA Gardner M., Goble A., Griffiths-Jones S., Harris D.E., Katzer F.,
RA Larke N., Lord A., Maser P., McKellar S., Mooney P., Morton F.,
RA Nene V., O'Neil S., Price C., Quail M.A., Rabbinowitsch E.,
RA Rawlings N.D., Rutter S., Saunders D., Seeger K., Shah T., Squares R.,
RA Squares S., Tivey A., Walker A.R., Woodward J., Dobbelaere D.A.E.,
RA Langsley G., Rajandream M.A., McKeever D., Shiels B., Tait A.,
RA Barrell B.G., Hall N.;
RT "Genome of the host-cell transforming parasite Theileria annulata
RT compared with T. parva.";
RL Science 309:131-133(2005).
CC -!- SUBCELLULAR LOCATION: Cell membrane; Lipid-anchor, GPI-anchor
CC (Potential). Note=In microneme/rhoptry complexes (By similarity).
CC -----------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
CC -----------------------------------------------------------------------
DR EMBL; CR940353; CAI76474.1; -; Genomic_DNA.
DR KEGG; tan:TA08425; -.
DR InterPro; IPR007480; DUF529.
DR Pfam; PF04385; FAINT; 4.
PE 3: Inferred from homology;
KW Complete proteome; Glycoprotein; GPI-anchor; Lipoprotein; Membrane;
KW Repeat; Signal; Sporozoite.
FT SIGNAL 1 19 Potential.
FT CHAIN 20 873 104 kDa microneme/rhoptry antigen.
FT /FTId=PRO_0000232680.
FT PROPEP 874 893 Removed in mature form (Potential).
FT /FTId=PRO_0000232681.
FT COMPBIAS 215 220 Poly-Leu.
FT COMPBIAS 486 683 Lys-rich.
FT COMPBIAS 854 859 Poly-Arg.
FT LIPID 873 873 GPI-anchor amidated aspartate
FT (Potential).
SQ SEQUENCE 893 AA; 101921 MW; 2F67CEB3B02E7AC1 CRC64;
MKFLVLLFNI LCLFPILGAD ELVMSPIPTT DVQPKVTFDI NSEVSSGPLY LNPVEMAGVK
YLQLQRQPGV QVHKVVEGDI VIWENEEMPL YTCAIVTQNE VPYMAYVELL EDPDLIFFLK
EGDQWAPIPE DQYLARLQQL RQQIHTESFF SLNLSFQHEN YKYEMVSSFQ HSIKMVVFTP
KNGHICKMVY DKNIRIFKAL YNEYVTSVIG FFRGLKLLLL NIFVIDDRGM IGNKYFQLLD
DKYAPISVQG YVATIPKLKD FAEPYHPIIL DISDIDYVNF YLGDATYHDP GFKIVPKTPQ
CITKVVDGNE VIYESSNPSV ECVYKVTYYD KKNESMLRLD LNHSPPSYTS YYAKREGVWV
TSTYIDLEEK IEELQDHRST ELDVMFMSDK DLNVVPLTNG NLEYFMVTPK PHRDIIIVFD
GSEVLWYYEG LENHLVCTWI YVTEGAPRLV HLRVKDRIPQ NTDIYMVKFG EYWVRISKTQ
YTQEIKKLIK KSKKKLPSIE EEDSDKHGGP PKGPEPPTGP GHSSSESKEH EDSKESKEPK
EHGSPKETKE GEVTKKPGPA KEHKPSKIPV YTKRPEFPKK SKSPKRPESP KSPKRPVSPQ
RPVSPKSPKR PESLDIPKSP KRPESPKSPK RPVSPQRPVS PRRPESPKSP KSPKSPKSPK
VPFDPKFKEK LYDSYLDKAA KTKETVTLPP VLPTDESFTH TPIGEPTAEQ PDDIEPIEES
VFIKETGILT EEVKTEDIHS ETGEPEEPKR PDSPTKHSPK PTGTHPSMPK KRRRSDGLAL
STTDLESEAG RILRDPTGKI VTMKRSKSFD DLTTVREKEH MGAEIRKIVV DDDGTEADDE
DTHPSKEKHL STVRRRRPRP KKSSKSSKPR KPDSAFVPSI IFIFLVSLIV GIL
//
ID 104K_THEPA Reviewed; 924 AA.
AC P15711; Q4N2B5;
DT 01-APR-1990, integrated into UniProtKB/Swiss-Prot.
DT 01-APR-1990, sequence version 1.
DT 23-OCT-2007, entry version 38.
DE 104 kDa microneme/rhoptry antigen precursor (p104).
GN OrderedLocusNames=TP04_0437;
OS Theileria parva.
OC Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Piroplasmida;
OC Theileriidae; Theileria.
OX NCBI_TaxID=5875;
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RC STRAIN=Muguga;
RX MEDLINE=90158697; PubMed=1689460; DOI=10.1016/0166-6851(90)90007-9;
RA Iams K.P., Young J.R., Nene V., Desai J., Webster P., Ole-Moiyoi O.K.,
RA Musoke A.J.;
RT "Characterisation of the gene encoding a 104-kilodalton microneme-
RT rhoptry protein of Theileria parva.";
RL Mol. Biochem. Parasitol. 39:47-60(1990).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Muguga;
RX PubMed=15994558; DOI=10.1126/science.1110439;
RA Gardner M.J., Bishop R., Shah T., de Villiers E.P., Carlton J.M.,
RA Hall N., Ren Q., Paulsen I.T., Pain A., Berriman M., Wilson R.J.M.,
RA Sato S., Ralph S.A., Mann D.J., Xiong Z., Shallom S.J., Weidman J.,
RA Jiang L., Lynn J., Weaver B., Shoaibi A., Domingo A.R., Wasawo D.,
RA Crabtree J., Wortman J.R., Haas B., Angiuoli S.V., Creasy T.H., Lu C.,
RA Suh B., Silva J.C., Utterback T.R., Feldblyum T.V., Pertea M.,
RA Allen J., Nierman W.C., Taracha E.L.N., Salzberg S.L., White O.R.,
RA Fitzhugh H.A., Morzaria S., Venter J.C., Fraser C.M., Nene V.;
RT "Genome sequence of Theileria parva, a bovine pathogen that transforms
RT lymphocytes.";
RL Science 309:134-137(2005).
CC -!- SUBCELLULAR LOCATION: Cell membrane; Lipid-anchor, GPI-anchor
CC (Potential). Note=In microneme/rhoptry complexes.
CC -!- DEVELOPMENTAL STAGE: Sporozoite antigen.
CC -----------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
CC -----------------------------------------------------------------------
DR EMBL; M29954; AAA18217.1; -; Unassigned_DNA.
DR EMBL; AAGK01000004; EAN31789.1; -; Genomic_DNA.
DR PIR; A44945; A44945.
DR KEGG; tpv:TP04_0437; -.
DR InterPro; IPR007480; DUF529.
DR Pfam; PF04385; FAINT; 4.
PE 2: Evidence at transcript level;
KW Complete proteome; Glycoprotein; GPI-anchor; Lipoprotein; Membrane;
KW Repeat; Signal; Sporozoite.
FT SIGNAL 1 19 Potential.
FT CHAIN 20 904 104 kDa microneme/rhoptry antigen.
FT /FTId=PRO_0000046081.
FT PROPEP 905 924 Removed in mature form (Potential).
FT /FTId=PRO_0000232679.
FT COMPBIAS 508 753 Pro-rich.
FT COMPBIAS 880 883 Poly-Arg.
FT LIPID 904 904 GPI-anchor amidated aspartate
FT (Potential).
SQ SEQUENCE 924 AA; 103626 MW; 289B4B554A61870E CRC64;
MKFLILLFNI LCLFPVLAAD NHGVGPQGAS GVDPITFDIN SNQTGPAFLT AVEMAGVKYL
QVQHGSNVNI HRLVEGNVVI WENASTPLYT GAIVTNNDGP YMAYVEVLGD PNLQFFIKSG
DAWVTLSEHE YLAKLQEIRQ AVHIESVFSL NMAFQLENNK YEVETHAKNG ANMVTFIPRN
GHICKMVYHK NVRIYKATGN DTVTSVVGFF RGLRLLLINV FSIDDNGMMS NRYFQHVDDK
YVPISQKNYE TGIVKLKDYK HAYHPVDLDI KDIDYTMFHL ADATYHEPCF KIIPNTGFCI
TKLFDGDQVL YESFNPLIHC INEVHIYDRN NGSIICLHLN YSPPSYKAYL VLKDTGWEAT
THPLLEEKIE ELQDQRACEL DVNFISDKDL YVAALTNADL NYTMVTPRPH RDVIRVSDGS
EVLWYYEGLD NFLVCAWIYV SDGVASLVHL RIKDRIPANN DIYVLKGDLY WTRITKIQFT
QEIKRLVKKS KKKLAPITEE DSDKHDEPPE GPGASGLPPK APGDKEGSEG HKGPSKGSDS
SKEGKKPGSG KKPGPAREHK PSKIPTLSKK PSGPKDPKHP RDPKEPRKSK SPRTASPTRR
PSPKLPQLSK LPKSTSPRSP PPPTRPSSPE RPEGTKIIKT SKPPSPKPPF DPSFKEKFYD
DYSKAASRSK ETKTTVVLDE SFESILKETL PETPGTPFTT PRPVPPKRPR TPESPFEPPK
DPDSPSTSPS EFFTPPESKR TRFHETPADT PLPDVTAELF KEPDVTAETK SPDEAMKRPR
SPSEYEDTSP GDYPSLPMKR HRLERLRLTT TEMETDPGRM AKDASGKPVK LKRSKSFDDL
TTVELAPEPK ASRIVVDDEG TEADDEETHP PEERQKTEVR RRRPPKKPSK SPRPSKPKKP
KKPDSAYIPS ILAILVVSLI VGIL
//
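One possible way to do this, as a minimal Python sketch rather than a definitive solution: go through the file line by line, take the first whitespace-separated token as the line code, and keep the line if the code is "AC", or if the code is "CC" and the rest of the line starts with "-!- SUBCELLULAR LOCATION". The function names `extract_lines` and `extract_to_file` are made up for this example, and it assumes the file is plain text in the layout shown above.

```python
MARKER = "-!- SUBCELLULAR LOCATION"


def extract_lines(lines):
    """Yield AC lines and CC lines that start with MARKER, in file order."""
    for line in lines:
        text = line.rstrip("\n")
        # First token is the two-letter line code; the remainder is the data.
        parts = text.split(None, 1)
        if len(parts) < 2:
            continue  # blank line or the "//" entry terminator
        code, rest = parts
        if code == "AC":
            yield text
        elif code == "CC" and rest.startswith(MARKER):
            yield text


def extract_to_file(in_path, out_path):
    """Copy the selected lines from in_path to out_path (hypothetical helper)."""
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for kept in extract_lines(src):
            dst.write(kept + "\n")
```

It could be called as `extract_to_file("input.dat", "output.txt")`, where both file names are placeholders. Note that this keeps only the line that actually carries the marker; the continuation line that follows it (for example "CC (Potential). Note=...") is not written, so if the whole comment block is wanted the loop would need to keep collecting "CC" lines until the next "-!-" or dashed line.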