💎一站式轻松地调用各大LLM模型接口,支持GPT4、智谱、星火、月之暗面及文生图 广告
[TOC] # 创建数据 ~~~ import pandas as pd example = pd.DataFrame({'Month': ["January", "January", "January", "January", "February", "February", "February", "February", "March", "March", "March", "March"], 'Category': ["Transportation", "Grocery", "Household", "Entertainment", "Transportation", "Grocery", "Household", "Entertainment", "Transportation", "Grocery", "Household", "Entertainment"], 'Amount': [74., 235., 175., 100., 115., 240., 225., 125., 90., 260., 200., 120.]}) print(example) ~~~ 输出 ~~~ Amount Category Month 0 74.0 Transportation January 1 235.0 Grocery January 2 175.0 Household January 3 100.0 Entertainment January 4 115.0 Transportation February 5 240.0 Grocery February 6 225.0 Household February 7 125.0 Entertainment February 8 90.0 Transportation March 9 260.0 Grocery March 10 200.0 Household March 11 120.0 Entertainment March ~~~ # 指定索引和列统计 索引index按照columns统计values ~~~ example_pivot = example.pivot(index='Category', columns='Month', values='Amount') print(example_pivot) ~~~ # 按照不同维度求和 ~~~ example_pivot = example.sum(axis=0) print(example_pivot) ~~~ 输出 ~~~ Amount 1959 Category TransportationGroceryHouseholdEntertainmentTra... Month JanuaryJanuaryJanuaryJanuaryFebruaryFebruaryFe... dtype: object ~~~ # 求平均 csv结构 ![](https://box.kancloud.cn/fd0c3fb8beac1e1740d1610710c9cdca_1770x612.png) ~~~ df = pd.read_csv('titanic.csv') table = df.pivot_table(index='Sex', columns='Pclass', values='Fare') print(table) ~~~ 输出 ~~~ Pclass 1 2 3 Sex female 106.125798 21.970121 16.118810 male 67.226127 19.741782 12.661633 ~~~ # 求最大 ~~~ df = pd.read_csv('titanic.csv') table = df.pivot_table(index='Sex', columns='Pclass', values='Fare', aggfunc='max') print(table) ~~~ 输出 ~~~ Pclass 1 2 3 Sex female 512.3292 65.0 69.55 male 512.3292 73.5 69.55 ~~~ # 统计 不同index在不同的columns里面他的value是怎么样(默认是求平均),也可以设置展示的是和或者max ~~~ df = pd.read_csv('titanic.csv') # table = df.pivot_table(index='Sex', columns='Pclass', values='Fare', aggfunc='count') table = df.pivot_table(index='Sex', columns='Pclass', values='Fare', aggfunc='count') print(table) ~~~ 输出 ~~~ Pclass 1 2 3 Sex female 94 76 144 male 122 108 347 ~~~ 这个结果也是下面这个 查看某个键在某个列上的分布情况 ~~~ df = pd.read_csv('titanic.csv') crosstab = pd.crosstab(index=df['Sex'], columns=df['Pclass']) print(crosstab) ~~~ **查看获救的情况** 查看不同的船仓,男性和女性获救的情况 ~~~ df = pd.read_csv('titanic.csv') table = df.pivot_table(index='Pclass', columns='Sex', values='Survived', aggfunc='mean') print(table) ~~~ 输出 ~~~ Sex female male Pclass 1 0.968085 0.368852 2 0.921053 0.157407 3 0.500000 0.135447 ~~~ **查看未成年人获救情况** ~~~ df = pd.read_csv('titanic.csv') df['Underaged'] = df['Age'] <= 18 table = df.pivot_table(index='Underaged', columns='Sex', values='Survived', aggfunc='mean') print(table) ~~~ 输出 ~~~ Sex female male Underaged False 0.760163 0.167984 True 0.676471 0.338028 ~~~