Replacing text patterns

We have a column with these values:

[1] "disorderType=1q21.1%20susceptibility%20locus%20for%20Thrombocytopenia-Abs...                 
[2] "disorderType=1p36%20microdeletion%20syndrome"                                                                                 
[3] "disorderType=1q21.1%20recurrent%20microdeletion%20(susceptility%20locus%2...              
[4] "disorderType=1q21.1%20recurrent%20microduplication%20(possible%20suscepti...
[5] "disorderType=WAGR%2011p13%20deletion%20syndrome"                                                                              
[6] "disorderType=Potocki-Shaffer%20syndrome"                                                                                      
[7] "disorderType=12q14%20microdeletion%20syndrome"                                                                                
[8] "disorderType=Prader-Willi%20syndrome%20(Type%201)"                                                                            
[9] "disorderType=Angelman%20syndrome%20(Type%201)"                                                                                
[10] "disorderType=Prader-Willi%20Syndrome%20(Type%202)" 

We want to keep only the syndrome name: drop the "disorderType=" prefix and turn each "%20" into a plain space.
We will use gsub() for the substitutions. It takes three arguments: the first is the pattern to search for, the second is the replacement for every match of that pattern, and the third is the variable to apply the substitutions to.
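
As a minimal sketch of that argument order, the toy vector x below (a made-up example, not part of the DECIPHER data) shows gsub() replacing every match. It is contrasted with sub(), the base R sibling that replaces only the first match in each string.

```r
# Hypothetical toy input, used only to illustrate the argument order
x <- c("a_b", "c_d_e")

# gsub(pattern, replacement, x): replaces every match in each element
all_replaced <- gsub("_", "-", x)

# sub() takes the same arguments but stops after the first match
first_replaced <- sub("_", "-", x)

print(all_replaced)    # "a-b" "c-d-e"
print(first_replaced)  # "a-b" "c-d_e"
```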

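First we remove the prefix. The pattern is a regular expression, so "^" anchors it to the start of the string; no trailing "*" is needed (in a regular expression "=*" means "zero or more = signs", not "anything after ="):

> decipherDisorders$disorder <- gsub("^disorderType=", "", decipherDisorders$disorder)
> decipherDisorders$disorder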
[1] "1q21.1%20susceptibility%20locus%20for%20Thrombocytopenia-Absent%20Radius%...                     
[2] "1p36%20microdeletion%20syndrome"                                                                                 
[3] "1q21.1%20recurrent%20microdeletion%20(susceptility%20locus%20for%20neurod...              
[4] "1q21.1%20recurrent%20microduplication%20(possible%20susceptiblity%20locus...
[5] "WAGR%2011p13%20deletion%20syndrome"                                                                              
[6] "Potocki-Shaffer%20syndrome"                                                                                      
[7] "12q14%20microdeletion%20syndrome"                                                                                
[8] "Prader-Willi%20syndrome%20(Type%201)"                                                                            
[9] "Angelman%20syndrome%20(Type%201)"                                                                                
[10] "Prader-Willi%20Syndrome%20(Type%202)"        

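Then we replace each %20 with a space. Since "%20" is a literal string and not a regular expression, fixed = TRUE makes gsub() match it verbatim:

> decipherDisorders$disorder <- gsub("%20", " ", decipherDisorders$disorder, fixed = TRUE)
> decipherDisorders$disorder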
[1] "1q21.1 susceptibility locus for Thrombocytopenia-Absent Radius (TAR) synd...                   
[2] "1p36 microdeletion syndrome"                                                                     
[3] "1q21.1 recurrent microdeletion (susceptility locus for neurodevelopmental...          
[4] "1q21.1 recurrent microduplication (possible susceptiblity locus for neuro...
[5] "WAGR 11p13 deletion syndrome"                                                                    
[6] "Potocki-Shaffer syndrome"                                                                        
[7] "12q14 microdeletion syndrome"                                                                    
[8] "Prader-Willi syndrome (Type 1)"                                                                  
[9] "Angelman syndrome (Type 1)"                                                                      
[10] "Prader-Willi Syndrome (Type 2)"  
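
Both steps can be tried end to end on a small self-contained vector. The sketch below copies three of the untruncated values from the column above into a variable named disorder (a name chosen only for this example). It anchors the prefix with "^" and matches "%20" literally with fixed = TRUE; both are choices of this sketch. Since "%20" is the URL percent-encoding of a space, it also shows utils::URLdecode() as an alternative that was not used in the original workflow. The vapply() wrapper is there as a precaution, assuming URLdecode() may handle only one string at a time.

```r
# Sample values copied from the column shown above (untruncated entries only)
disorder <- c(
  "disorderType=1p36%20microdeletion%20syndrome",
  "disorderType=Potocki-Shaffer%20syndrome",
  "disorderType=Prader-Willi%20syndrome%20(Type%201)"
)

# Step 1: drop the prefix; "^" anchors the match to the start of each string
no_prefix <- sub("^disorderType=", "", disorder)

# Step 2: replace every literal "%20" with a space
clean <- gsub("%20", " ", no_prefix, fixed = TRUE)

# Alternative to step 2 (an assumption of this sketch, not the original method):
# decode all percent-escapes, one string at a time
decoded <- vapply(no_prefix, utils::URLdecode, character(1), USE.NAMES = FALSE)

print(clean)
```

URLdecode() would also decode other escapes such as "%28" or "%2C", which may or may not be wanted; the gsub() call touches only "%20".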
