                         SEQUENCE LISTING

<110>  The Trustees of the University of Pennsylvania
 
<120>  GENE THERAPY FOR MUCOPOLYSACCHARIDOSIS IIIA

<130>  UPN-18-8476PCT

<150>  US 62/593,081
<151>  2017-11-30

<160>  7     

<170>  PatentIn version 3.5

<210>  1
<211>  1509
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  Engineered nucleic acid sequence encoding human 
       N-sulfoglucosamine sulfohydrolase (hSGSH)

<400>  1
atgagctgcc ctgtgcctgc ctgttgtgcc ctgctgctgg tgctgggact gtgcagagcc       60

agacccagaa acgctctgct gctgctggcc gacgatggcg gctttgagag cggcgcctac      120

aacaacagcg ccattgccac ccctcacctg gacgccctgg ccagaagaag cctgctgttc      180

agaaacgcct tcaccagcgt gtccagctgc agccctagca gagcctctct gctgaccgga      240

ctgcctcagc accagaacgg gatgtacggc ctgcaccagg acgtgcacca cttcaacagc      300

ttcgacaaag tgcggagcct gccactgctg ctgtctcagg ctggcgtgcg gacaggcatc      360

atcggcaaga aacacgtggg ccccgagaca gtgtacccct tcgacttcgc ctacaccgaa      420

gagaacggca gcgtgctgca agtgggccgg aacatcaccc ggatcaaact gctcgtgcgg      480

aagttcctgc agacccagga cgaccggccc ttcttcctgt acgtggcctt ccacgacccc      540

cacagatgtg gccactccca gcctcagtac ggcaccttct gcgagaagtt cggcaacggc      600

gagagcggca tgggcagaat ccctgattgg accccccagg cctacgaccc cctggatgtg      660

ctggtgccct acttcgtgcc caacacccct gccgccagag ccgatctggc cgcccagtat      720

acaaccgtgg gcaggatgga tcagggcgtg ggactggtgc tgcaggaact gagggacgcc      780

ggcgtgctga acgacaccct cgtgatcttt accagcgaca acggcatccc attccccagc      840

ggccggacca atctgtactg gcctggaaca gccgagcccc tgctggtgtc tagccctgag      900

caccctaaga gatggggcca ggtgtccgag gcctacgtgt ccctgctgga tctgaccccc      960

accatcctgg actggttcag catcccctac cccagctacg ccatcttcgg ctccaagacc     1020

atccacctga ccggcagatc tctgctgcct gccctggaag ccgaacctct gtgggccaca     1080

gtgtttggca gccagagcca ccacgaagtg accatgtcct accccatgcg gagcgtgcag     1140

caccggcact tcagactggt gcacaacctg aacttcaaga tgcccttccc aatcgaccag     1200

gacttctatg tgtccccaac cttccaggac ctgctgaaca gaaccacagc cggccagcct     1260

accggctggt acaaggacct gcggcactac tactaccggg ccagatggga gctgtacgac     1320

agaagcaggg acccccacga gacacagaac ctggccaccg accctagatt cgcccagctg     1380

ctggaaatgc tgcgggacca gctggccaag tggcagtggg agacacacga cccttgggtg     1440

tgcgctcctg acggggtgct ggaagagaag ctgagccctc agtgccagcc cctgcacaac     1500

gagctgtga                                                             1509


<210>  2
<211>  502
<212>  PRT
<213>  Homo sapiens

<400>  2

Met Ser Cys Pro Val Pro Ala Cys Cys Ala Leu Leu Leu Val Leu Gly 
1               5                   10                  15      


Leu Cys Arg Ala Arg Pro Arg Asn Ala Leu Leu Leu Leu Ala Asp Asp 
            20                  25                  30          


Gly Gly Phe Glu Ser Gly Ala Tyr Asn Asn Ser Ala Ile Ala Thr Pro 
        35                  40                  45              


His Leu Asp Ala Leu Ala Arg Arg Ser Leu Leu Phe Arg Asn Ala Phe 
    50                  55                  60                  


Thr Ser Val Ser Ser Cys Ser Pro Ser Arg Ala Ser Leu Leu Thr Gly 
65                  70                  75                  80  


Leu Pro Gln His Gln Asn Gly Met Tyr Gly Leu His Gln Asp Val His 
                85                  90                  95      


His Phe Asn Ser Phe Asp Lys Val Arg Ser Leu Pro Leu Leu Leu Ser 
            100                 105                 110         


Gln Ala Gly Val Arg Thr Gly Ile Ile Gly Lys Lys His Val Gly Pro 
        115                 120                 125             


Glu Thr Val Tyr Pro Phe Asp Phe Ala Tyr Thr Glu Glu Asn Gly Ser 
    130                 135                 140                 


Val Leu Gln Val Gly Arg Asn Ile Thr Arg Ile Lys Leu Leu Val Arg 
145                 150                 155                 160 


Lys Phe Leu Gln Thr Gln Asp Asp Arg Pro Phe Phe Leu Tyr Val Ala 
                165                 170                 175     


Phe His Asp Pro His Arg Cys Gly His Ser Gln Pro Gln Tyr Gly Thr 
            180                 185                 190         


Phe Cys Glu Lys Phe Gly Asn Gly Glu Ser Gly Met Gly Arg Ile Pro 
        195                 200                 205             


Asp Trp Thr Pro Gln Ala Tyr Asp Pro Leu Asp Val Leu Val Pro Tyr 
    210                 215                 220                 


Phe Val Pro Asn Thr Pro Ala Ala Arg Ala Asp Leu Ala Ala Gln Tyr 
225                 230                 235                 240 


Thr Thr Val Gly Arg Met Asp Gln Gly Val Gly Leu Val Leu Gln Glu 
                245                 250                 255     


Leu Arg Asp Ala Gly Val Leu Asn Asp Thr Leu Val Ile Phe Thr Ser 
            260                 265                 270         


Asp Asn Gly Ile Pro Phe Pro Ser Gly Arg Thr Asn Leu Tyr Trp Pro 
        275                 280                 285             


Gly Thr Ala Glu Pro Leu Leu Val Ser Ser Pro Glu His Pro Lys Arg 
    290                 295                 300                 


Trp Gly Gln Val Ser Glu Ala Tyr Val Ser Leu Leu Asp Leu Thr Pro 
305                 310                 315                 320 


Thr Ile Leu Asp Trp Phe Ser Ile Pro Tyr Pro Ser Tyr Ala Ile Phe 
                325                 330                 335     


Gly Ser Lys Thr Ile His Leu Thr Gly Arg Ser Leu Leu Pro Ala Leu 
            340                 345                 350         


Glu Ala Glu Pro Leu Trp Ala Thr Val Phe Gly Ser Gln Ser His His 
        355                 360                 365             


Glu Val Thr Met Ser Tyr Pro Met Arg Ser Val Gln His Arg His Phe 
    370                 375                 380                 


Arg Leu Val His Asn Leu Asn Phe Lys Met Pro Phe Pro Ile Asp Gln 
385                 390                 395                 400 


Asp Phe Tyr Val Ser Pro Thr Phe Gln Asp Leu Leu Asn Arg Thr Thr 
                405                 410                 415     


Ala Gly Gln Pro Thr Gly Trp Tyr Lys Asp Leu Arg His Tyr Tyr Tyr 
            420                 425                 430         


Arg Ala Arg Trp Glu Leu Tyr Asp Arg Ser Arg Asp Pro His Glu Thr 
        435                 440                 445             


Gln Asn Leu Ala Thr Asp Pro Arg Phe Ala Gln Leu Leu Glu Met Leu 
    450                 455                 460                 


Arg Asp Gln Leu Ala Lys Trp Gln Trp Glu Thr His Asp Pro Trp Val 
465                 470                 475                 480 


Cys Ala Pro Asp Gly Val Leu Glu Glu Lys Leu Ser Pro Gln Cys Gln 
                485                 490                 495     


Pro Leu His Asn Glu Leu 
            500         


<210>  3
<211>  1509
<212>  DNA
<213>  Homo sapiens

<400>  3
atgagctgcc ccgtgcccgc ctgctgcgcg ctgctgctag tcctggggct ctgccgggcg       60

cgtccccgga acgcactgct gctcctcgcg gatgacggag gctttgagag tggcgcgtac      120

aacaacagcg ccatcgccac cccgcacctg gacgccttgg cccgccgcag cctcctcttt      180

cgcaatgcct tcacctcggt cagcagctgc tctcccagcc gcgccagcct cctcactggc      240

ctgccccagc atcagaatgg gatgtacggg ctgcaccagg acgtgcacca cttcaactcc      300

ttcgacaagg tgcggagcct gccgctgctg ctcagccaag ctggtgtgcg cacaggcatc      360

atcgggaaga agcacgtggg gccggagacc gtgtacccgt ttgactttgc gtacacggag      420

gagaatggct ccgtcctcca ggtggggcgg aacatcacta gaattaagct gctcgtccgg      480

aaattcctgc agactcagga tgaccggcct ttcttcctct acgtcgcctt ccacgacccc      540

caccgctgtg ggcactccca gccccagtac ggaaccttct gtgagaagtt tggcaacgga      600

gagagcggca tgggtcgtat cccagactgg accccccagg cctacgaccc actggacgtg      660

ctggtgcctt acttcgtccc caacaccccg gcagcccgag ccgacctggc cgctcagtac      720

accaccgtcg gccgcatgga ccaaggagtt ggactggtgc tccaggagct gcgtgacgcc      780

ggtgtcctga acgacacact ggtgatcttc acgtccgaca acgggatccc cttccccagc      840

ggcaggacca acctgtactg gccgggcact gctgaaccct tactggtgtc atccccggag      900

cacccaaaac gctggggcca agtcagcgag gcctacgtga gcctcctaga cctcacgccc      960

accatcttgg attggttctc gatcccgtac cccagctacg ccatctttgg ctcgaagacc     1020

atccacctca ctggccggtc cctcctgccg gcgctggagg ccgagcccct ctgggccacc     1080

gtctttggca gccagagcca ccacgaggtc accatgtcct accccatgcg ctccgtgcag     1140

caccggcact tccgcctcgt gcacaacctc aacttcaaga tgccctttcc catcgaccag     1200

gacttctacg tctcacccac cttccaggac ctcctgaacc gcaccacagc tggtcagccc     1260

acgggctggt acaaggacct ccgtcattac tactaccggg cgcgctggga gctctacgac     1320

cggagccggg acccccacga gacccagaac ctggccaccg acccgcgctt tgctcagctt     1380

ctggagatgc ttcgggacca gctggccaag tggcagtggg agacccacga cccctgggtg     1440

tgcgcccccg acggcgtcct ggaggagaag ctctctcccc agtgccagcc cctccacaat     1500

gagctgtga                                                             1509


<210>  4
<211>  3837
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  rAAV vector genome AAV.CB7.CI.hSGSHco.rBG


<220>
<221>  repeat_region
<222>  (1)..(130)
<223>  AAV2 5'ITR

<220>
<221>  promoter
<222>  (198)..(579)
<223>  CMV IE promoter

<220>
<221>  promoter
<222>  (582)..(863)
<223>  CB promoter

<220>
<221>  TATA_signal
<222>  (836)..(839)

<220>
<221>  Intron
<222>  (958)..(1930)
<223>  chicken beta-actin

<220>
<221>  CDS
<222>  (1948)..(3459)
<223>  Engineered nucleic acid sequence encoding human 
       N-sulfoglucosamine sulfohydrolase (hSGSH)

<220>
<221>  polyA_signal
<222>  (3493)..(3619)
<223>  Rabbit globin poly A (rBG, RBG)

<220>
<221>  repeat_region
<222>  (3708)..(3837)
<223>  AAV2 3'ITR

<400>  4
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctaccag ggtaatgggg      180

atcctctaga actatagcta gtcgacattg attattgact agttattaat agtaatcaat      240

tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa      300

tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt      360

tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggact atttacggta      420

aactgcccac ttggcagtac atcaagtgta tcatatgcca agtacgcccc ctattgacgt      480

caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttat gggactttcc      540

tacttggcag tacatctacg tattagtcat cgctattacc atggtcgagg tgagccccac      600

gttctgcttc actctcccca tctccccccc ctccccaccc ccaattttgt atttatttat      660

tttttaatta ttttgtgcag cgatgggggc gggggggggg ggggggcgcg cgccaggcgg      720

ggcggggcgg ggcgaggggc ggggcggggc gaggcggaga ggtgcggcgg cagccaatca      780

gagcggcgcg ctccgaaagt ttccttttat ggcgaggcgg cggcggcggc ggccctataa      840

aaagcgaagc gcgcggcggg cggggagtcg ctgcgacgct gccttcgccc cgtgccccgc      900

tccgccgccg cctcgcgccg cccgccccgg ctctgactga ccgcgttact cccacaggtg      960

agcgggcggg acggcccttc tcctccgggc tgtaattagc gcttggttta atgacggctt     1020

gtttcttttc tgtggctgcg tgaaagcctt gaggggctcc gggagggccc tttgtgcggg     1080

gggagcggct cggggggtgc gtgcgtgtgt gtgtgcgtgg ggagcgccgc gtgcggctcc     1140

gcgctgcccg gcggctgtga gcgctgcggg cgcggcgcgg ggctttgtgc gctccgcagt     1200

gtgcgcgagg ggagcgcggc cgggggcggt gccccgcggt gcgggggggg ctgcgagggg     1260

aacaaaggct gcgtgcgggg tgtgtgcgtg ggggggtgag cagggggtgt gggcgcgtcg     1320

gtcgggctgc aaccccccct gcacccccct ccccgagttg ctgagcacgg cccggcttcg     1380

ggtgcggggc tccgtacggg gcgtggcgcg gggctcgccg tgccgggcgg ggggtggcgg     1440

caggtggggg tgccgggcgg ggcggggccg cctcgggccg gggagggctc gggggagggg     1500

cgcggcggcc cccggagcgc cggcggctgt cgaggcgcgg cgagccgcag ccattgcctt     1560

ttatggtaat cgtgcgagag ggcgcaggga cttcctttgt cccaaatctg tgcggagccg     1620

aaatctggga ggcgccgccg caccccctct agcgggcgcg gggcgaagcg gtgcggcgcc     1680

ggcaggaagg aaatgggcgg ggagggcctt cgtgcgtcgc cgcgccgccg tccccttctc     1740

cctctccagc ctcggggctg tccgcggggg gacggctgcc ttcggggggg acggggcagg     1800

gcggggttcg gcttctggcg tgtgaccggc ggctctagag cctctgctaa ccatgttcat     1860

gccttcttct ttttcctaca gctcctgggc aacgtgctgg ttattgtgct gtctcatcat     1920

tttggcaaag aattcacgcg tgccacc atg agc tgc cct gtg cct gcc tgt tgt     1974
                              Met Ser Cys Pro Val Pro Ala Cys Cys         
                              1               5                           

gcc ctg ctg ctg gtg ctg gga ctg tgc aga gcc aga ccc aga aac gct       2022
Ala Leu Leu Leu Val Leu Gly Leu Cys Arg Ala Arg Pro Arg Asn Ala           
10                  15                  20                  25            

ctg ctg ctg ctg gcc gac gat ggc ggc ttt gag agc ggc gcc tac aac       2070
Leu Leu Leu Leu Ala Asp Asp Gly Gly Phe Glu Ser Gly Ala Tyr Asn           
                30                  35                  40                

aac agc gcc att gcc acc cct cac ctg gac gcc ctg gcc aga aga agc       2118
Asn Ser Ala Ile Ala Thr Pro His Leu Asp Ala Leu Ala Arg Arg Ser           
            45                  50                  55                    

ctg ctg ttc aga aac gcc ttc acc agc gtg tcc agc tgc agc cct agc       2166
Leu Leu Phe Arg Asn Ala Phe Thr Ser Val Ser Ser Cys Ser Pro Ser           
        60                  65                  70                        

aga gcc tct ctg ctg acc gga ctg cct cag cac cag aac ggg atg tac       2214
Arg Ala Ser Leu Leu Thr Gly Leu Pro Gln His Gln Asn Gly Met Tyr           
    75                  80                  85                            

ggc ctg cac cag gac gtg cac cac ttc aac agc ttc gac aaa gtg cgg       2262
Gly Leu His Gln Asp Val His His Phe Asn Ser Phe Asp Lys Val Arg           
90                  95                  100                 105           

agc ctg cca ctg ctg ctg tct cag gct ggc gtg cgg aca ggc atc atc       2310
Ser Leu Pro Leu Leu Leu Ser Gln Ala Gly Val Arg Thr Gly Ile Ile           
                110                 115                 120               

ggc aag aaa cac gtg ggc ccc gag aca gtg tac ccc ttc gac ttc gcc       2358
Gly Lys Lys His Val Gly Pro Glu Thr Val Tyr Pro Phe Asp Phe Ala           
            125                 130                 135                   

tac acc gaa gag aac ggc agc gtg ctg caa gtg ggc cgg aac atc acc       2406
Tyr Thr Glu Glu Asn Gly Ser Val Leu Gln Val Gly Arg Asn Ile Thr           
        140                 145                 150                       

cgg atc aaa ctg ctc gtg cgg aag ttc ctg cag acc cag gac gac cgg       2454
Arg Ile Lys Leu Leu Val Arg Lys Phe Leu Gln Thr Gln Asp Asp Arg           
    155                 160                 165                           

ccc ttc ttc ctg tac gtg gcc ttc cac gac ccc cac aga tgt ggc cac       2502
Pro Phe Phe Leu Tyr Val Ala Phe His Asp Pro His Arg Cys Gly His           
170                 175                 180                 185           

tcc cag cct cag tac ggc acc ttc tgc gag aag ttc ggc aac ggc gag       2550
Ser Gln Pro Gln Tyr Gly Thr Phe Cys Glu Lys Phe Gly Asn Gly Glu           
                190                 195                 200               

agc ggc atg ggc aga atc cct gat tgg acc ccc cag gcc tac gac ccc       2598
Ser Gly Met Gly Arg Ile Pro Asp Trp Thr Pro Gln Ala Tyr Asp Pro           
            205                 210                 215                   

ctg gat gtg ctg gtg ccc tac ttc gtg ccc aac acc cct gcc gcc aga       2646
Leu Asp Val Leu Val Pro Tyr Phe Val Pro Asn Thr Pro Ala Ala Arg           
        220                 225                 230                       

gcc gat ctg gcc gcc cag tat aca acc gtg ggc agg atg gat cag ggc       2694
Ala Asp Leu Ala Ala Gln Tyr Thr Thr Val Gly Arg Met Asp Gln Gly           
    235                 240                 245                           

gtg gga ctg gtg ctg cag gaa ctg agg gac gcc ggc gtg ctg aac gac       2742
Val Gly Leu Val Leu Gln Glu Leu Arg Asp Ala Gly Val Leu Asn Asp           
250                 255                 260                 265           

acc ctc gtg atc ttt acc agc gac aac ggc atc cca ttc ccc agc ggc       2790
Thr Leu Val Ile Phe Thr Ser Asp Asn Gly Ile Pro Phe Pro Ser Gly           
                270                 275                 280               

cgg acc aat ctg tac tgg cct gga aca gcc gag ccc ctg ctg gtg tct       2838
Arg Thr Asn Leu Tyr Trp Pro Gly Thr Ala Glu Pro Leu Leu Val Ser           
            285                 290                 295                   

agc cct gag cac cct aag aga tgg ggc cag gtg tcc gag gcc tac gtg       2886
Ser Pro Glu His Pro Lys Arg Trp Gly Gln Val Ser Glu Ala Tyr Val           
        300                 305                 310                       

tcc ctg ctg gat ctg acc ccc acc atc ctg gac tgg ttc agc atc ccc       2934
Ser Leu Leu Asp Leu Thr Pro Thr Ile Leu Asp Trp Phe Ser Ile Pro           
    315                 320                 325                           

tac ccc agc tac gcc atc ttc ggc tcc aag acc atc cac ctg acc ggc       2982
Tyr Pro Ser Tyr Ala Ile Phe Gly Ser Lys Thr Ile His Leu Thr Gly           
330                 335                 340                 345           

aga tct ctg ctg cct gcc ctg gaa gcc gaa cct ctg tgg gcc aca gtg       3030
Arg Ser Leu Leu Pro Ala Leu Glu Ala Glu Pro Leu Trp Ala Thr Val           
                350                 355                 360               

ttt ggc agc cag agc cac cac gaa gtg acc atg tcc tac ccc atg cgg       3078
Phe Gly Ser Gln Ser His His Glu Val Thr Met Ser Tyr Pro Met Arg           
            365                 370                 375                   

agc gtg cag cac cgg cac ttc aga ctg gtg cac aac ctg aac ttc aag       3126
Ser Val Gln His Arg His Phe Arg Leu Val His Asn Leu Asn Phe Lys           
        380                 385                 390                       

atg ccc ttc cca atc gac cag gac ttc tat gtg tcc cca acc ttc cag       3174
Met Pro Phe Pro Ile Asp Gln Asp Phe Tyr Val Ser Pro Thr Phe Gln           
    395                 400                 405                           

gac ctg ctg aac aga acc aca gcc ggc cag cct acc ggc tgg tac aag       3222
Asp Leu Leu Asn Arg Thr Thr Ala Gly Gln Pro Thr Gly Trp Tyr Lys           
410                 415                 420                 425           

gac ctg cgg cac tac tac tac cgg gcc aga tgg gag ctg tac gac aga       3270
Asp Leu Arg His Tyr Tyr Tyr Arg Ala Arg Trp Glu Leu Tyr Asp Arg           
                430                 435                 440               

agc agg gac ccc cac gag aca cag aac ctg gcc acc gac cct aga ttc       3318
Ser Arg Asp Pro His Glu Thr Gln Asn Leu Ala Thr Asp Pro Arg Phe           
            445                 450                 455                   

gcc cag ctg ctg gaa atg ctg cgg gac cag ctg gcc aag tgg cag tgg       3366
Ala Gln Leu Leu Glu Met Leu Arg Asp Gln Leu Ala Lys Trp Gln Trp           
        460                 465                 470                       

gag aca cac gac cct tgg gtg tgc gct cct gac ggg gtg ctg gaa gag       3414
Glu Thr His Asp Pro Trp Val Cys Ala Pro Asp Gly Val Leu Glu Glu           
    475                 480                 485                           

aag ctg agc cct cag tgc cag ccc ctg cac aac gag ctg tga tga           3459
Lys Leu Ser Pro Gln Cys Gln Pro Leu His Asn Glu Leu                       
490                 495                 500                               

ctcgaggacg gggtgaacta cgcctgagga tccgatcttt ttccctctgc caaaaattat     3519

ggggacatca tgaagcccct tgagcatctg acttctggct aataaaggaa atttattttc     3579

attgcaatag tgtgttggaa ttttttgtgt ctctcactcg gaagcaattc gttgatctga     3639

atttcgacca cccataatac ccattaccct ggtagataag tagcatggcg ggttaatcat     3699

taactacaag gaacccctag tgatggagtt ggccactccc tctctgcgcg ctcgctcgct     3759

cactgaggcc gggcgaccaa aggtcgcccg acgcccgggc tttgcccggg cggcctcagt     3819

gagcgagcga gcgcgcag                                                   3837


<210>  5
<211>  502
<212>  PRT
<213>  Artificial Sequence

<220>
<223>  Synthetic Construct

<400>  5

Met Ser Cys Pro Val Pro Ala Cys Cys Ala Leu Leu Leu Val Leu Gly 
1               5                   10                  15      


Leu Cys Arg Ala Arg Pro Arg Asn Ala Leu Leu Leu Leu Ala Asp Asp 
            20                  25                  30          


Gly Gly Phe Glu Ser Gly Ala Tyr Asn Asn Ser Ala Ile Ala Thr Pro 
        35                  40                  45              


His Leu Asp Ala Leu Ala Arg Arg Ser Leu Leu Phe Arg Asn Ala Phe 
    50                  55                  60                  


Thr Ser Val Ser Ser Cys Ser Pro Ser Arg Ala Ser Leu Leu Thr Gly 
65                  70                  75                  80  


Leu Pro Gln His Gln Asn Gly Met Tyr Gly Leu His Gln Asp Val His 
                85                  90                  95      


His Phe Asn Ser Phe Asp Lys Val Arg Ser Leu Pro Leu Leu Leu Ser 
            100                 105                 110         


Gln Ala Gly Val Arg Thr Gly Ile Ile Gly Lys Lys His Val Gly Pro 
        115                 120                 125             


Glu Thr Val Tyr Pro Phe Asp Phe Ala Tyr Thr Glu Glu Asn Gly Ser 
    130                 135                 140                 


Val Leu Gln Val Gly Arg Asn Ile Thr Arg Ile Lys Leu Leu Val Arg 
145                 150                 155                 160 


Lys Phe Leu Gln Thr Gln Asp Asp Arg Pro Phe Phe Leu Tyr Val Ala 
                165                 170                 175     


Phe His Asp Pro His Arg Cys Gly His Ser Gln Pro Gln Tyr Gly Thr 
            180                 185                 190         


Phe Cys Glu Lys Phe Gly Asn Gly Glu Ser Gly Met Gly Arg Ile Pro 
        195                 200                 205             


Asp Trp Thr Pro Gln Ala Tyr Asp Pro Leu Asp Val Leu Val Pro Tyr 
    210                 215                 220                 


Phe Val Pro Asn Thr Pro Ala Ala Arg Ala Asp Leu Ala Ala Gln Tyr 
225                 230                 235                 240 


Thr Thr Val Gly Arg Met Asp Gln Gly Val Gly Leu Val Leu Gln Glu 
                245                 250                 255     


Leu Arg Asp Ala Gly Val Leu Asn Asp Thr Leu Val Ile Phe Thr Ser 
            260                 265                 270         


Asp Asn Gly Ile Pro Phe Pro Ser Gly Arg Thr Asn Leu Tyr Trp Pro 
        275                 280                 285             


Gly Thr Ala Glu Pro Leu Leu Val Ser Ser Pro Glu His Pro Lys Arg 
    290                 295                 300                 


Trp Gly Gln Val Ser Glu Ala Tyr Val Ser Leu Leu Asp Leu Thr Pro 
305                 310                 315                 320 


Thr Ile Leu Asp Trp Phe Ser Ile Pro Tyr Pro Ser Tyr Ala Ile Phe 
                325                 330                 335     


Gly Ser Lys Thr Ile His Leu Thr Gly Arg Ser Leu Leu Pro Ala Leu 
            340                 345                 350         


Glu Ala Glu Pro Leu Trp Ala Thr Val Phe Gly Ser Gln Ser His His 
        355                 360                 365             


Glu Val Thr Met Ser Tyr Pro Met Arg Ser Val Gln His Arg His Phe 
    370                 375                 380                 


Arg Leu Val His Asn Leu Asn Phe Lys Met Pro Phe Pro Ile Asp Gln 
385                 390                 395                 400 


Asp Phe Tyr Val Ser Pro Thr Phe Gln Asp Leu Leu Asn Arg Thr Thr 
                405                 410                 415     


Ala Gly Gln Pro Thr Gly Trp Tyr Lys Asp Leu Arg His Tyr Tyr Tyr 
            420                 425                 430         


Arg Ala Arg Trp Glu Leu Tyr Asp Arg Ser Arg Asp Pro His Glu Thr 
        435                 440                 445             


Gln Asn Leu Ala Thr Asp Pro Arg Phe Ala Gln Leu Leu Glu Met Leu 
    450                 455                 460                 


Arg Asp Gln Leu Ala Lys Trp Gln Trp Glu Thr His Asp Pro Trp Val 
465                 470                 475                 480 


Cys Ala Pro Asp Gly Val Leu Glu Glu Lys Leu Ser Pro Gln Cys Gln 
                485                 490                 495     


Pro Leu His Asn Glu Leu 
            500         


<210>  6
<211>  736
<212>  PRT
<213>  Artificial Sequence

<220>
<223>  capsid protein VP1 of adeno-associated virus 9

<400>  6

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser 
1               5                   10                  15      


Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro 
            20                  25                  30          


Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro 
        35                  40                  45              


Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro 
    50                  55                  60                  


Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp 
65                  70                  75                  80  


Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala 
                85                  90                  95      


Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly 
            100                 105                 110         


Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro 
        115                 120                 125             


Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg 
    130                 135                 140                 


Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly 
145                 150                 155                 160 


Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr 
                165                 170                 175     


Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro 
            180                 185                 190         


Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly 
        195                 200                 205             


Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser 
    210                 215                 220                 


Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile 
225                 230                 235                 240 


Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu 
                245                 250                 255     


Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn 
            260                 265                 270         


Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg 
        275                 280                 285             


Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn 
    290                 295                 300                 


Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile 
305                 310                 315                 320 


Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn 
                325                 330                 335     


Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu 
            340                 345                 350         


Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro 
        355                 360                 365             


Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp 
    370                 375                 380                 


Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe 
385                 390                 395                 400 


Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu 
                405                 410                 415     


Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu 
            420                 425                 430         


Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser 
        435                 440                 445             


Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser 
    450                 455                 460                 


Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro 
465                 470                 475                 480 


Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn 
                485                 490                 495     


Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn 
            500                 505                 510         


Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys 
        515                 520                 525             


Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly 
    530                 535                 540                 


Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile 
545                 550                 555                 560 


Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser 
                565                 570                 575     


Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ala Gln Ala Gln 
            580                 585                 590         


Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln 
        595                 600                 605             


Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His 
    610                 615                 620                 


Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met 
625                 630                 635                 640 


Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala 
                645                 650                 655     


Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr 
            660                 665                 670         


Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln 
        675                 680                 685             


Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn 
    690                 695                 700                 


Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val 
705                 710                 715                 720 


Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu 
                725                 730                 735     


<210>  7
<211>  2211
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  nucleic acid sequence encoding capsid protein VP1 of 
       adeno-associated virus 9

<400>  7
atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc       60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac      120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac      180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac      240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc      300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag      360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct      420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc      480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag      540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct      600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga      660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc      720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc      780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc      840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga      900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt      960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc     1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac     1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg     1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc     1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta     1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc     1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg     1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct     1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa     1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct     1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct     1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata     1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg     1740

gccacaaacc accagagtgc ccaagcacag gcgcagaccg gctgggttca aaaccaagga     1800

atacttccgg gtatggtttg gcaggacaga gatgtgtacc tgcaaggacc catttgggcc     1860

aaaattcctc acacggacgg caactttcac ccttctccgc tgatgggagg gtttggaatg     1920

aagcacccgc ctcctcagat cctcatcaaa aacacacctg tacctgcgga tcctccaacg     1980

gccttcaaca aggacaagct gaactctttc atcacccagt attctactgg ccaagtcagc     2040

gtggagatcg agtgggagct gcagaaggaa aacagcaagc gctggaaccc ggagatccag     2100

tacacttcca actattacaa gtctaataat gttgaatttg ctgttaatac tgaaggtgta     2160

tatagtgaac cccgccccat tggcaccaga tacctgactc gtaatctgta a              2211


