python - Basic regex pattern returns undesired results
I have input like this (a list named mylist):

    Encontrados 2 inmuebles Página 1 de 1
    Encontrados 1 inmuebles Página 1 de 1
    Encontrados 0 inmuebles
    Encontrados 1.931 inmuebles Página 1 de 129
    Encontrados 12 inmuebles Página 1 de 1

I want to extract the first occurrence in each line of a one- or two-digit number (0-99). The desired output:

    ['2', '1', '0', '12']

I do not want the fourth line to match, because its number has more than two digits (in Spanish the decimal separator is a comma and the thousands separator is a dot).

My approach is the pattern

    (\d{1,2})

with mask = re.compile(r'\d+'), and then I take the first group with

    [mask.search(item).group(0) for item in mylist]

But the output I am getting is:

    ['2', '1', '0', '1', '12']

I believe this happens because the first occurrence in 'Encontrados 1.931 inmuebles Página 1 de 129' is the string '1' that follows the word 'Página', but I can't fix this on my own.

Proposed solution

Use a negative lookahead:

    (?!...)

Specify that the digits must not be followed by a dot, such as:

    \d{1,2}(?!\.)

although it can still match other digits later in the line, including the number after Página. If you want to be more specific:

    (\d{1,2}(?! de|\.))

This rejects, for example, a match that is followed by the word "de".
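To see how these patterns behave, here is a minimal Python sketch. The contents of mylist are reconstructed from the question and should be treated as an assumption, and first_small and first_two_digit_numbers are hypothetical names that appear in neither the question nor the answer. The sketch reproduces the question's attempt, then shows what the lookahead alone returns on the fourth line, and finally tries a stricter variant that anchors on the first number in the line and rejects it when a digit or a dot follows.

```python
import re

# Sample lines reconstructed from the question (assumed contents of mylist).
mylist = [
    "Encontrados 2 inmuebles Página 1 de 1",
    "Encontrados 1 inmuebles Página 1 de 1",
    "Encontrados 0 inmuebles",
    "Encontrados 1.931 inmuebles Página 1 de 129",
    "Encontrados 12 inmuebles Página 1 de 1",
]

# The question's attempt: take the first run of digits in each line.
naive = re.compile(r"\d+")
print([naive.search(item).group(0) for item in mylist])
# ['2', '1', '0', '1', '12']

# The lookahead on its own: the '1' before the dot is rejected, but the
# search simply moves on and matches inside '931'.
lookahead = re.compile(r"\d{1,2}(?!\.)")
print(lookahead.search(mylist[3]).group(0))
# 93

# Hypothetical stricter variant (an assumption, not the answer's pattern):
# skip the leading non-digits, capture 1-2 digits, and require that
# neither another digit nor a dot follows.
first_small = re.compile(r"^\D*(\d{1,2})(?![\d.])")


def first_two_digit_numbers(lines):
    """Return the first number of each line if it is 0-99, skipping other lines."""
    found = (first_small.search(line) for line in lines)
    return [m.group(1) for m in found if m]


print(first_two_digit_numbers(mylist))
# ['2', '1', '0', '12']
```

Because the variant is anchored with ^, a line whose first number is rejected produces no match at all instead of falling through to a later number, so the list comprehension has to skip None results.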
Online Examples:
Regex101