python - Convert a Pandas DataFrame to bin frequencies
Using pandas, I know how to bin a single column, but I'm struggling to figure out how to bin multiple columns and then find the counts (frequency) of the bins, because my dataframe has 20 columns. I know I could repeat the single-column approach 20 times, but I'm interested in learning a better way. Here is the head of the dataframe, with the first 4 columns displayed:
         Percentile1  Percentile2  Percentile3  Percentile4
    395     0.166667     0.266667     0.266667     0.133333
    424     0.266667     0.266667     0.133333     0.032258
    511     0.032258     0.129032     0.129032     0.387097
    540     0.129032     0.129032     0.387097     0.612903
    570     0.129032     0.387097     0.612903     0.741935

I have created the following bin array:
    output = ['0-10', '10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90', '90-100']

Here is my desired output:
         Percentile1  Percentile2  Percentile3  Percentile4
    395        10-20        20-30        20-30        10-20
    424        20-30        20-30        10-20         0-10
    511         0-10        10-20        10-20        30-40
    540        10-20        10-20        30-40        60-70
    570        10-20        30-40        60-70        70-80

After this, I would ideally do a frequency/value count to get something like this:

           Percentile1  Percentile2  Percentile3  Percentile4
    0-10   frequency #
    10-20
    20-30
    30-40
    40-50
    etc.

Any help would be greatly appreciated.
I would probably do something like the following:
    print(df)

       Percentile1  Percentile2  Percentile3  Percentile4
    0     0.166667     0.266667     0.266667     0.133333
    1     0.266667     0.266667     0.133333     0.032258
    2     0.032258     0.129032     0.129032     0.387097
    3     0.129032     0.129032     0.387097     0.612903
    4     0.129032     0.387097     0.612903     0.741935

Now use apply and cut to create a new dataframe that replaces each percentile with the decile bin it falls in (apply iterates over each column):

    import pandas as pd

    # Bin edges 0, 10, ..., 100; the percentiles are fractions, so scale by 100.
    bins = list(range(0, 110, 10))
    new = df.apply(lambda x: pd.Series(pd.cut(x * 100, bins)))
    print(new)

      Percentile1 Percentile2 Percentile3 Percentile4
    0    (10, 20]    (20, 30]    (20, 30]    (10, 20]
    1    (20, 30]    (20, 30]    (10, 20]     (0, 10]
    2     (0, 10]    (10, 20]    (10, 20]    (30, 40]
    3    (10, 20]    (10, 20]    (30, 40]    (60, 70]
    4    (10, 20]    (30, 40]    (60, 70]    (70, 80]

Use apply once again to get the frequency counts:
    # Share of rows in each bin, per column
    print(new.apply(lambda x: x.value_counts() / x.count()))

              Percentile1  Percentile2  Percentile3  Percentile4
    (0, 10]           0.2          NaN          NaN          0.2
    (10, 20]          0.6          0.4          0.4          0.2
    (20, 30]          0.2          0.4          0.2          NaN
    (30, 40]          NaN          0.2          0.2          0.2
    (60, 70]          NaN          NaN          0.2          0.2
    (70, 80]          NaN          NaN          NaN          0.2

or the value counts:
    # Number of rows in each bin, per column
    print(new.apply(lambda x: x.value_counts()))

              Percentile1  Percentile2  Percentile3  Percentile4
    (0, 10]             1          NaN          NaN            1
    (10, 20]            3            2            2            1
    (20, 30]            1            2            1          NaN
    (30, 40]          NaN            1            1            1
    (60, 70]          NaN          NaN            1            1
    (70, 80]          NaN          NaN          NaN            1

Another approach is to skip the intermediate dataframe (which I called new) and go straight to the value counts in one command:

    df.apply(lambda x: pd.cut(x * 100, bins).value_counts())

              Percentile1  Percentile2  Percentile3  Percentile4
    (0, 10]             1          NaN          NaN            1
    (10, 20]            3            2            2            1
    (20, 30]            1            2            1          NaN
    (30, 40]          NaN            1            1            1
    (60, 70]          NaN          NaN            1            1
    (70, 80]          NaN          NaN          NaN            1
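The output above uses interval notation such as (10, 20] rather than the '0-10' style strings the question asked for. The following is a minimal sketch of one way to get those strings, assuming Python 3.6+ and a reasonably recent pandas: it passes a list of labels to pd.cut. The sample data is copied from the question; the names labels, binned, counts and freq are illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question (first four columns only).
df = pd.DataFrame(
    {
        "Percentile1": [0.166667, 0.266667, 0.032258, 0.129032, 0.129032],
        "Percentile2": [0.266667, 0.266667, 0.129032, 0.129032, 0.387097],
        "Percentile3": [0.266667, 0.133333, 0.129032, 0.387097, 0.612903],
        "Percentile4": [0.133333, 0.032258, 0.387097, 0.612903, 0.741935],
    },
    index=[395, 424, 511, 540, 570],
)

bins = list(range(0, 110, 10))  # edges 0, 10, ..., 100
# One label per pair of neighbouring edges: '0-10', '10-20', ..., '90-100'
labels = [f"{lo}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]

# Replace every value with the label of the bin it falls in, column by column.
binned = df.apply(lambda col: pd.cut(col * 100, bins=bins, labels=labels))

# Count rows per label and column; bins with no rows are filled with 0.
counts = (
    binned.apply(lambda col: col.value_counts(sort=False))
    .reindex(labels)
    .fillna(0)
    .astype(int)
)

# Relative frequency: share of rows in each bin.
freq = counts / len(df)

print(binned)
print(counts)
print(freq)
```

Here the empty bins show up as 0 instead of NaN because of the explicit reindex and fillna. Note that pd.cut uses right-closed intervals by default, so a value of exactly 0 would not land in any bin; passing include_lowest=True would cover that case.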