pcre - Match or remove a string that occurs multiple times between two strings with regex
I have a large CSV export where the columns do not align, because some values were accidentally split across multiple cells instead of one. Fortunately, the affected values always sit between two unique strings, so I am hoping to use a regex to merge them back into one cell. Sample data is as follows:

    "apple","NULL","0","0","0",",","1",",","fruit","red","sweet","d$","object"
    "horse","NULL","0","0","0",",","1",",","animal","big","tail","d$","object"
    "los angeles","NULL","0","0","0",",","1",",","city","california","blurry","entertainment","d$","location"

The misaligned values start after "NULL","0","0","0",",","1","," and end before "d$".

I'm trying to find a regex that matches the "," separators between "NULL","0","0","0",",","1","," and "d$", so that they can be removed in order to merge the values in between, like so:

    "apple","NULL","0","0","0",",","1",",","fruit,red,sweet","d$","object"
    "horse","NULL","0","0","0",",","1",",","animal,big,tail","d$","object"
    "los angeles","NULL","0","0","0",",","1",",","city,california,blurry,entertainment","d$","location"
You can do this:

    $pattern = '~(?:"NULL","0","0","0",",","1",",","|(?!^)\G)[^"]*\K","(?!d\$)~';
    $csv = preg_replace($pattern, ',', $csv);

Pattern details:

    ~                                     # delimiter
    (?:
        "NULL","0","0","0",",","1",",","  # entry point for the first match
      |
        (?!^)\G                           # anchor for the end of the last match
    )
    [^"]*                                 # content between the quotes
    \K                                    # removes everything on the left from the match result
    ","
    (?!d\$)                               # not followed by d$
    ~

\G is an anchor that means "the start of the string" or "the end of the last match". I added (?!^) to exclude the first case.

For the first match, the entry point is "NULL","0","0","0",",","1",","," and then the content between the quotes is matched. Since \K removes everything to its left from the match result, only "," is replaced. For the following matches, \G is used as the entry point, and the contiguous matches continue until the negative lookahead (?!d\$) fails, that is, until the next field is d$.
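
To see the pattern at work, here is a minimal sketch that runs it against two rows shaped like the ones in the question. The sample rows and the helper name `mergeSplitCells` are assumptions made for illustration, and the sketch assumes the fields are separated by a bare comma with no surrounding spaces.

```php
<?php
// Hypothetical helper: wraps the preg_replace call from the answer.
function mergeSplitCells(string $csv): string
{
    // Single-quoted string, so \G, \K and \$ reach PCRE unchanged.
    $pattern = '~(?:"NULL","0","0","0",",","1",",","|(?!^)\G)[^"]*\K","(?!d\$)~';
    $result = preg_replace($pattern, ',', $csv);
    if ($result === null) {
        // preg_replace returns null when the regex engine reports an error.
        throw new RuntimeException('preg_replace failed');
    }
    return $result;
}

// Assumed sample rows (nowdoc, so "d$" is not treated as a variable).
$csv = <<<'CSV'
"apple","NULL","0","0","0",",","1",",","fruit","red","sweet","d$","object"
"los angeles","NULL","0","0","0",",","1",",","city","california","blurry","entertainment","d$","location"
CSV;

// Each row should now hold one merged cell between the marker and "d$".
echo mergeSplitCells($csv), PHP_EOL;
```

Only the `","` separators inside the marked region are touched, because `\K` keeps the marker and the field text out of the replaced match. The separator in front of `"d$"` survives because the lookahead rejects it.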