1. Normalization: dividing a column by its maximum value
While there are many ways to normalize data, one of the simplest is to divide every value in a column by that column's maximum value. For columns with no negative values, this puts every value between 0 and 1, with the largest value mapped to exactly 1. To calculate the maximum value of a column, we use the Series.max() method.
input
max_protein = food_info["Protein_(g)"].max()
normalized_protein = food_info["Protein_(g)"] / max_protein
print(normalized_protein.head(5))
output
0 0.009624
1 0.009624
2 0.003170
3 0.242301
4 0.263134
Name: Protein_(g), dtype: float64
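The same division can be applied to several columns at once. The following is a minimal sketch, assuming a small made-up DataFrame named toy as a hypothetical stand-in for food_info (the values are not from the real dataset):

```python
import pandas as pd

# Hypothetical stand-in for food_info; the numbers are invented for illustration.
toy = pd.DataFrame({
    "Protein_(g)": [5.0, 20.0, 10.0],
    "Lipid_Tot_(g)": [8.0, 2.0, 0.0],
})

# toy.max() returns a Series holding each column's maximum.
# Dividing the DataFrame by it aligns on column names, so every
# column is divided by its own maximum.
normalized = toy / toy.max()
print(normalized)
```

Each column's largest entry becomes 1.0, and the rest scale proportionally.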
2. Adding and subtracting columns
food_info["Normalized_Protein"] = food_info["Protein_(g)"] / food_info["Protein_(g)"].max()
food_info["Normalized_Fat"] = food_info["Lipid_Tot_(g)"] / food_info["Lipid_Tot_(g)"].max()
food_info["Norm_Nutr_Index"] = 2*food_info["Normalized_Protein"] + (-0.75*food_info["Normalized_Fat"])
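To see what the weighted index evaluates to, here is a small worked sketch on made-up data. The DataFrame toy and its values are assumptions for illustration only; the weights 2 and -0.75 are the ones used above.

```python
import pandas as pd

# Hypothetical stand-in for food_info; the numbers are invented for illustration.
toy = pd.DataFrame({
    "Protein_(g)": [5.0, 20.0, 10.0],
    "Lipid_Tot_(g)": [8.0, 2.0, 0.0],
})

# Normalize each column by its maximum, as in section 1.
toy["Normalized_Protein"] = toy["Protein_(g)"] / toy["Protein_(g)"].max()
toy["Normalized_Fat"] = toy["Lipid_Tot_(g)"] / toy["Lipid_Tot_(g)"].max()

# Arithmetic between columns is element-wise, row by row:
# protein is rewarded (weight 2) and fat is penalized (weight -0.75).
toy["Norm_Nutr_Index"] = 2 * toy["Normalized_Protein"] + (-0.75 * toy["Normalized_Fat"])
print(toy["Norm_Nutr_Index"])
```

Because both normalized columns lie between 0 and 1 here, the index can range from -0.75 (no protein, maximum fat) to 2 (maximum protein, no fat).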
3. Creating a new column
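# normalized_protein comes from section 1; normalized_fat is computed the same way.
normalized_fat = food_info["Lipid_Tot_(g)"] / food_info["Lipid_Tot_(g)"].max()
food_info["Normalized_Protein"] = normalized_protein
food_info["Normalized_Fat"] = normalized_fat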
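One detail worth knowing when assigning a Series as a new column: pandas matches values by index label, not by position. A minimal sketch with hypothetical names (df, partial) and made-up values:

```python
import pandas as pd

# Hypothetical DataFrame with non-default index labels.
df = pd.DataFrame({"Protein_(g)": [5.0, 20.0, 10.0]}, index=[10, 20, 30])

# A Series that covers only two of the three labels, in a different order.
partial = pd.Series([300.0, 100.0], index=[30, 10])

# Assignment aligns on the index: label 20 has no match and becomes NaN.
df["New_Column"] = partial
print(df)
```

When the Series is derived from a column of the same DataFrame, as in the notes above, the labels always match and no missing values appear.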
4. Sorting in ascending or descending order (documentation): DataFrame.sort_values("YY", ascending=True)
food_info.sort_values("Norm_Nutr_Index", inplace=True, ascending=False)
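- inplace=True: no new object is created; the original object is modified directly;
- inplace=False: the original object is not modified; a new, sorted object is created and returned instead;
- ascending: sort ascending vs. descending. Specify a list for multiple sort orders. If this is a list of bools, it must match the length of the by argument.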
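The three options above can be sketched on a small made-up DataFrame. The name toy and the columns group and score are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical data; not taken from food_info.
toy = pd.DataFrame({
    "group": ["b", "a", "b", "a"],
    "score": [4.0, 3.0, 2.0, 1.0],
})

# inplace=False (the default): a new sorted DataFrame is returned
# and toy keeps its original row order.
by_score = toy.sort_values("score", ascending=False)

# A list of bools gives one direction per column in `by`:
# group ascending, then score descending within each group.
mixed = toy.sort_values(["group", "score"], ascending=[True, False])

# inplace=True: toy itself is reordered and the call returns None.
result = toy.sort_values("score", inplace=True)
print(result)              # None
print(toy.index.tolist())  # original row labels, now in sorted order
```

Note that sorting keeps the original index labels attached to their rows, so the index is no longer in 0, 1, 2, ... order afterwards.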