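Institution: Cancer Research Institute, Xiangya School of Medicine, Central South University
Source: Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》), 2003, No. 6, pp. 930-934 (5 pages)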
Abstract: An expressed sequence tag (EST) from human nasopharyngeal tissue, numbered BG222624, was selected from the UniGene database and analyzed online with the NCBI Blast server, which showed that the EST was an unknown sequence representing a novel gene. Blast searches of the GenBank nr and EST databases were used to build an EST contig, and analysis with the NCBI ORFfinder server showed that the contig contained a complete reading frame. Primers were designed flanking the start codon and the stop codon of the cDNA reading frame, PCR amplification was carried out with a human fetal brain cDNA library as template, and the full-length cDNA sequence of the gene was determined by sequencing. The cDNA is 1672 bp long, with the reading frame at positions 304 to 1557, and encodes a protein of 417 amino acids with a molecular mass of 46.58 ku and a theoretical pI of 4.21. Sequence similarity analysis of the protein with the NCBI Blast server showed that it is homologous to an unknown protein from adult mouse retina (BAB32214). In consultation with the HUGO Gene Nomenclature Committee, the gene was named adult retina hypothetical protein (ARHP); its GenBank accession number is AY174896. Bioinformatics analysis indicated that the protein may be a nuclear protein involved in transcriptional regulation. The ARHP gene maps to chromosome 5q35, spans 35163 bp, and contains 4 exons and 3 introns. The 5′ untranslated region of the gene contains …
BLAST analysis suggested that a cDNA fragment (GenBank accession number BG222624) derived from human nasopharynx might represent a novel human gene. Using bioinformatics and experimental techniques, a novel human gene has been cloned from a fetal brain cDNA library. Since this fragment contains a complete open reading frame (ORF) of 1254 bp with an upstream stop codon and a downstream poly(A) signal, it can be concluded that it is a full-length gene (GenBank accession number AY174896), which was named adult retina hypothetical protein (ARHP). The full-length cDNA of the ARHP gene is 1672 bp, coding for a 417-amino-acid polypeptide with a predicted molecular mass of 46.58 ku and an isoelectric point of 4.20. The deduced amino acid sequence showed 70% homology with the M. musculus protein BAB32214. Bioinformatics analysis suggested that the protein may be a nuclear protein regulating gene transcription. The new gene comprises four exons with three intervening introns and is localized to chromosome 5q35. Two CpG islands have been found in the 5′ UTR of this gene.
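The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the authors' analysis: it assumes the reading-frame coordinates (304 to 1557) are 1-based and inclusive and that the ORF length includes the stop codon. The function name `orf_summary` is hypothetical.

```python
# Values quoted in the abstract; coordinates are assumed to be
# 1-based and inclusive, and the ORF is assumed to include the stop codon.
ORF_START = 304
ORF_END = 1557
CDNA_LENGTH = 1672


def orf_summary(start, end):
    """Return (ORF length in bp, number of codons, number of amino acids)."""
    length_bp = end - start + 1
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length_bp // 3
    amino_acids = codons - 1  # the last codon is the stop codon
    return length_bp, codons, amino_acids


length_bp, codons, amino_acids = orf_summary(ORF_START, ORF_END)
five_prime_utr = ORF_START - 1            # bases before the start codon
three_prime_utr = CDNA_LENGTH - ORF_END   # bases after the stop codon

print(length_bp, codons, amino_acids)     # 1254 418 417
print(five_prime_utr, three_prime_utr)    # 303 115
```

Under these assumptions the numbers agree with each other: a 1254 bp ORF corresponds to 417 coding codons plus one stop codon, matching the 417-amino-acid protein reported.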
Keywords: adult retina hypothetical protein gene; gene cloning; bioinformatics analysis; expressed sequence tag