python - Basic RegEx pattern gives undesired results
I have input like this (a list named mylist):

    Encontrados 2 Inmuebles Página 1 de 1
    Encontrados 1 Inmuebles Página 1 de 1
    Encontrados 0 Inmuebles
    Encontrados 1.931 Inmuebles Página 1 de 129
    Encontrados 12 Inmuebles Página 1 de 1

I want to extract the first number in each line, provided it is a one- or two-digit number (0-99). The desired output:

    ['2', '1', '0', '12']

I do not want the fourth line to match, because its number has more than two digits (in Spanish the decimal separator is a comma and the thousands separator is a dot).

My approach is the pattern

    (\d{1,2})

With mask = re.compile('\d+'), I then take the first match of each item:

    [mask.search(item).group(0) for item in mylist]

But the output I am getting is:

    ['2', '1', '0', '1', '12']

I believe this happens because the first occurrence in Encontrados 1.931 Inmuebles Página 1 de 129 is the string '1' that follows the word 'Página', but I cannot fix this bug on my own.

Proposed solution

Use a negative lookahead, (?! ).

Specify that the digits must not be followed by a dot, such as:

    \d{1,2}(?!\.)

Although it still matches the number after Página, so you may want to be more specific:

    (\d{1,2}(?! de|\.))

This rejects, for example, a match that is followed by the word "de".

Online example: Regex101
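To check how these patterns behave with re.search on the sample lines, here is a small sketch. The list name mylist comes from the question. The helper first_match and the final, stricter pattern are assumptions added for illustration and are not part of the original answer. The stricter pattern also assumes that the count is always followed by the word "Inmuebles". Note that with re.search, the lookahead-only pattern returns '93' for the fourth line, because the search moves on into "931" after rejecting the leading "1".

```python
import re

# Sample lines from the question.
mylist = [
    "Encontrados 2 Inmuebles Página 1 de 1",
    "Encontrados 1 Inmuebles Página 1 de 1",
    "Encontrados 0 Inmuebles",
    "Encontrados 1.931 Inmuebles Página 1 de 129",
    "Encontrados 12 Inmuebles Página 1 de 1",
]


def first_match(pattern, text):
    """Return the first match of pattern in text, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# The question's approach: on the fourth line the '1' comes from "1.931".
plain = [first_match(r"\d+", s) for s in mylist]
print(plain)      # ['2', '1', '0', '1', '12']

# Negative lookahead from the answer: reject digits followed by a dot.
lookahead = [first_match(r"\d{1,2}(?!\.)", s) for s in mylist]
print(lookahead)  # ['2', '1', '0', '93', '12']

# Assumed stricter variant: no digit or dot may touch the number on either
# side, and " Inmuebles" must follow, so "1.931" is skipped entirely.
strict = re.compile(r"(?<![\d.])\d{1,2}(?![\d.])(?= Inmuebles)")
wanted = []
for s in mylist:
    m = strict.search(s)
    if m:  # lines without a one- or two-digit count are left out
        wanted.append(m.group(0))
print(wanted)     # ['2', '1', '0', '12']
```

The lookbehind and lookahead together stop the match from starting or ending inside a longer number such as 1.931 or 129, which a lookahead for the dot alone does not prevent.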
 