Corpus search results (left1)
Click a serial number to display the corresponding PubMed page
1 Pfam and corresponding DPAM-AI domains are at
2 Pfam clans are described in detail, together with the ne
3 Pfam contains multiple alignments and hidden Markov mode
4 Pfam homology and domain boundary annotations in the tar
5 Pfam is a collection of multiple alignments and profile
6 Pfam is a comprehensive collection of protein domains an
7 Pfam is a database of protein families that currently co
8 Pfam is a large collection of protein domains and famili
9 Pfam is a large collection of protein families and domai
10 Pfam is a large collection of protein multiple sequence
11 Pfam is a widely used database of protein families and d
12 Pfam is a widely used database of protein families, curr
13 Pfam is available on the web in the UK, the USA, France
14 Pfam is available via servers in the UK, the USA and Swe
15 Pfam is freely available for browsing and download at
16 Pfam is now based not only on the UniProtKB sequence dat
17 Pfam is possibly the most well known protein family data
18 Pfam protein domains are often thought of as evolutionar
19 Pfam provides similar coverage of ECOD with family class
20 Pfam release 24.0 contains 11,912 families, of which a l
21 Pfam search results have been calculated for the entire
22 Pfam term enrichment analysis revealed 172 protein famil
23 Pfam, available via servers in the UK and the USA, is a
24 Pfam-B, the automatically-generated supplement to Pfam,
25 systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence informati
30 39), DUF399 (domain of unknown function 399; Pfam ID: PF04187) and MARTX toxins that contribute to ho
31 unique domain-domain interactions among 4036 Pfam domains, out of which 4349 are inferred from PDB en
32 The pipeline is extended to a set of 417 Pfam families, built on the combination of Tara with oth
34 as disease-susceptible and 32 proteins and 67 Pfam families (10,783 domains) as disease-resistant based
37 F0-ATPase-regulatory proteins representing a Pfam protein family of 246 sequences from 219 species (P
38 ts: A neutral nucleotide model compared to a Pfam domain encoding model (PSILC(nuc/dom)); A protein c
43 he protein families data base of alignments (Pfam) analysis suggested the wit3.0 peptide sequence sha
44 e three-dimensional structure models for all Pfam-A sequence families with average length under 150 r
46 transcription factor-family curation of all Pfam domains, incorporated the Gene Ontology classificat
47 nabled the linking between authorship of all Pfam entries with the corresponding authors' ORCID ident
49 modelling of discontinuous domains allowing Pfam domain definitions to be closer to those found in s
50 heir UniProt BLASTX hits, GO annotation, and Pfam analysis results, are freely accessible as a public
51 to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the li
52 B database are, however, still available and Pfam annotations for individual UniProtKB sequences can
53 Two annotation schemes, i.e. MapMan BIN and Pfam, at two sparsity thresholds, i.e. top 100 (stringen
54 sequence-only (Superfamily, PDBAA BLAST and Pfam) and sequence-structure-based (SAM-T02, 3D-PSSM, mG
55 yme (carbohydrate active enzyme) classes and Pfam clans, which attested its usefulness in the phyloge
56 rotein families remain to be classified, and Pfam continues working toward comprehensive coverage of
57 ies directly using RPS-BLAST against COG and Pfam databases and indirectly via proxygenes that are id
59 esentatives of the exhaustive databases, and Pfam-A and Superfamily as databases that predefine famil
60 domain classifications such as InterPro and Pfam, and other ontologies such as mammalian phenotype a
62 n information from Superfamily, InterPro and Pfam; three-dimensional structures at the Protein Data B
67 me structures as described by CATH, SCOP and Pfam, and is available as an interactive website or a fl
68 us sequence databases, domains from SCOP and Pfam, patterns from Prosite and other predicted sequence
71 egions, more than 80% overlap with annotated Pfam domains, including all of the 15 known drug targets
74 ion of the sequence family databases such as Pfam and Interpro with the structure-oriented databases
75 rowsed using classification systems, such as Pfam, Gene Ontology annotation, mpstruc or the Transport
76 rences to other biological resources such as Pfam, SCOP, CATH, GO, InterPro and the NCBI taxonomy dat
80 By bridging the gap between sequence-based Pfam and structure-based ECOD domain classifications, ou
82 ysis to other databases: Reactome, BioCarta, Pfam, PID and SMART, finding additional hits in ErbB and
84 teins with at least one domain classified by Pfam as belonging to the Pseudouridine synthase and Arch
85 ve procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alig
87 r with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary
89 h alignment data from the public collections Pfam and SMART, as well as with contributions from colle
91 These results show that genes containing Pfam domains associated with duplication resistance in A
92 rches against a reference library containing Pfam-annotated UniProt sequences and random synthetic se
93 cid sequences that matches the corresponding Pfam family seed alignment, an alignment of DNA sequence
95 he structure's key reference, citation data, Pfam domain diagrams, topology diagrams and protein-prot
96 Each report also contains expression data, Pfam domain information and an associated Mouse Mutant P
99 e method uses the Protein Families database (Pfam) and motif finding algorithms to identify oligonucl
100 f HMMs is taken from two existing databases (Pfam and SUPERFAMILY), and is limited to models that exc
105 ble protein family databases (Blocks + DOMO, Pfam, PIR-ALN, PRINTS, PROSITE, ProDom, PROTOMAP, SBASE,
106 vel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and
110 proteins, but also in erythromycin esterase (Pfam ID: PF05139), DUF399 (domain of unknown function 39
111 ther domain enrichment approaches exploiting Pfam families, but benefits from more functionally coher
112 open reading frame, kinase) protein family (Pfam 00480) is a large collection of bacterial polypepti
113 embers of an uncharacterized protein family (Pfam PF08000), which provide compelling evidence for the
115 (max 6.6%) for species, 9.0% (max 28.7%) for Pfam protein domains and 9.4% (max 22.9%) for PANTHER ge
116 stimate that more than 4000 contact maps for Pfam families of unknown structure have more than 50% of
117 m Prints not present in PROSITE, blocks from Pfam-A not present in PROSITE or Prints, and so on for P
118 data from DisProt and independent data from Pfam to validate the above observations that rely on the
119 UniProt sequence database, domain data from Pfam, metabolic pathway and functional data from COGs, K
121 CDD collection contains models imported from Pfam, SMART and COG, as well as domain models curated at
122 and DIP; functional domain information from Pfam; protein fingerprints from PRINTS; protein family a
124 gent query domains, originally selected from Pfam, and full-length proteins containing their homologo
125 ered (overall, 82.9% of sequences taken from Pfam) and the alignment of amino acid sequences restrict
126 s that encodes a domain of unknown function (Pfam: PF10070) and a putative cation transporter (Pfam:
127 protein family (domain of unknown function; Pfam families PF07005 and PF17042) and (ii) discovered n
128 ce polymorphisms) and family resources (e.g. Pfam and eggNog) and displayed on the Gene3D website.
129 ffects accumulation of transcripts for genes/Pfam domains involved in ribosome biogenesis, photosynth
131 f ROK glucokinases and non-ROK glucokinases (Pfam 02685), revealing the primary sequence elements sha
133 sing the NCBI taxonomy database, IntEnz, GO, Pfam, InterPro, SCOP, CATH, PubMed, Ensembl, Homologene
134 163 OSC genes were investigated to identify Pfam domains significantly enriched in these regions.
136 ltritol to d-tagatose via a dehydrogenase in Pfam family PF00107, a previously unknown reaction; 2) p
138 r the past 2 years the number of families in Pfam has doubled and now stands at 6190 (version 10.0).
139 ions: (i) for all protein domain families in Pfam, the fixation of genes horizontally transferred is
140 integral membrane proteins, cereal genes in Pfam family PF02458 emerged as candidates for the ferulo
141 se to d-tagatose 6-phosphate via a kinase in Pfam family PF00294, a previously orphan EC number; and
142 uently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, lead
144 e have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now reso
145 domain boundaries, their classification into Pfam clans, as well as their functional annotation.
146 es: in particular, about 81% of medium-large Pfam families and 72% of ECOD families can be mapped to
148 arge protein families (including the largest Pfam alignment containing 27000 HIV GP120 glycoprotein s
149 ilies to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouragin
152 o the frequency of occurrence in the modeled Pfam families, suggesting the significant role of the Ta
154 using protein regions that match two or more Pfam families not currently annotated as related in Pfam
156 shold, (iii) recognition of FDRs in Multiple Pfam enzyme families, and (iv) recognition of multiple P
157 sters constituted of protein regions with no Pfam annotation, which are therefore candidates for repr
158 n uncovered a considerable fraction (15%) of Pfam domains containing multiple structural and evolutio
161 antial phylogenetic separations (1.1-9.7% of Pfam families surveyed at three taxonomic ranges studied
166 lent domains, and conversely the majority of Pfam domains sampled by our data play no currently estab
167 -4 to d-fructose 6-phosphate via a member of Pfam family PF08013, another previously unknown reaction
170 parsimony approach to compare repertoires of Pfam domains and their combinations, we show that indepe
171 In contrast, in a non-redundant sample of Pfam-AB, only 1% of four-amino acid word clumps (4.7% of
172 range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the
173 represents the first eukaryotic structure of Pfam family PF03937 and reveals a conserved surface regi
176 ade using a new algorithm based primarily on Pfam domain occurrence patterns in mitochondrial and non
177 ene Ontology Annotation projects; updates on Pfam, SMART and InterPro domain databases; update papers
178 arch options allow search by UniProt code or Pfam domain identifier, and results can be filtered by d
180 and RNA polymerase I, as well as many other Pfam families that had not previously been classified.
182 , diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY
183 sequences of protein family domains (Pfams), Pfam functions and clan information, we develop a deep l
184 nd Genomes (KEGG) annotations, and potential Pfam domains were assigned to each transcript isoform.
189 ignature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PA
190 oteins was predicted with pattern profilers (Pfam, Prosite, TMHMM, and pSORT), and by examining queri
191 ent; and integration with PROSITE, profiles, Pfam and ProDom, as part of the international InterPro p
195 ported as being homologues of TraB proteins (Pfam ID: PF01963), a widely distributed family of unknow
197 o expert annotations of domain-like regions (Pfam-A) and completing through cuts based on termini of
200 ures are the first examples of the Rep_trans Pfam family providing insights into the replication of n
201 tion features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of
204 e frequently found to consist of known short Pfam domains, e.g., leucine-rich repeats, tetratricopept
207 for the presence of protease cleavage sites, Pfam domains, glycosylation sites, signal peptides, tran
208 domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and
211 istory of each seed sequence in the spurious Pfam protein family (PF10695, 'Cw-hydrolase') uncovered
212 ences, but for proteins with known structure Pfam matches 95%, which we believe represents the likely
214 esponding three-dimensional (3D) structures, Pfam domains, and protein-protein interaction interfaces
215 available, have been utilised to ensure that Pfam families correspond with structural domains, and to
216 he most significant of these changes is that Pfam is now primarily based on the UniProtKB reference p
217 nce databases the fraction of sequences that Pfam matches is reduced, suggesting that continued addit
222 ting Intolerant from Tolerant (SIFT) and the Pfam-based LogR.E-value method, we have identified featu
225 method precludes definitive conclusions, the Pfam models provide the only tertiary structure informat
226 s of a nonredundant amino acid database, the Pfam domain database, plant Expressed Sequence Tags, and
227 relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand tho
229 iple sequence alignment as obtained from the Pfam database (a database of protein families and conser
235 , using 1750 gene trees constructed from the Pfam protein family database, that it appears to be a pr
237 in the database by adding families from the Pfam-A, ProDom and Domo databases to those from PROSITE
239 the domain of unknown function DUF28 in the Pfam and PALI databases for which there was no structura
240 enome that contain a domain described in the Pfam database as domain of unknown function 579 (DUF579)
241 known relationships between families in the Pfam database as well as detect novel distant relationsh
242 JIP60 are conserved in 815 plant RIPs in the Pfam database that were identified by HUMMR as containin
244 manually curated inclusion thresholds in the Pfam database, especially on the subset of families that
248 ition-specific scoring functions used in the Pfam models, the score statistics of hybrid alignment ob
249 6 models of protein domains contained in the Pfam v5.4 database verifies the theoretical predictions:
250 nces that contain the coding sequence of the Pfam alignment when they can be recovered (overall, 82.9
251 s in close correspondence to the ones of the Pfam and ECOD resources: in particular, about 81% of med
252 nces derived from the seed alignments of the Pfam database of amino acid alignments of families of ho
254 s intended for genome-scale searching of the Pfam database without having to install this database an
258 The new method, MITOPRED, is based on the Pfam domain occurrence patterns and the amino acid compo
260 Methodology improvements for searching the Pfam collection locally as well as via the web are descr
262 ion was compared to a proteome data set, the Pfam domain database, and the genomes of six other fungi
269 ead relying on direct collaboration with the Pfam sequence family database to inform our classificati
270 t of bioactivities stored in ChEMBL with the Pfam-A domain most likely to mediate small molecule bind
271 n interaction data was integrated within the Pfam database and website, but it has now been migrated
279 oups) and KO (KEGG Orthology) in addition to Pfam domains; (iii) information on intronless genes are
284 ower fold, added several protein families to Pfam database(2) and experimentally demonstrated that on
285 Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactiv
286 lust sequences are annotated with matches to Pfam, SCOP domains, and proteins in the PDB, using our H
289 n illustration, trRosetta was applied to two Pfam families with unknown structures, for which the pre
291 than a thousand structurally uncharacterized Pfam families to achieve reasonable structural annotatio
293 updated version to 26,219 among 5140 unique Pfam domains, a 23% increase compared to 20,513 unique D
294 of a large set of gene families with unknown Pfam domains and a number of species or desert-truffle-s
295 kground information, domain architecture via Pfam links, a list of all sequences and an assessment of
296 o 6000 amino acids with AlphaFold, visualize Pfam annotations, and dock ligands with AutoDock Vina an