I have a scatter plot, drawn with matplotlib from a pandas DataFrame, with two parameters: type of intervention (ventilation, filtration, source control, or combination) and type of benefit (health, productivity, or both). Right now the legend creates an entry for every combination of these parameters (so 15 entries). How can I simplify my legend so it just shows the dot colour for the intervention type and the dot shape for the benefit type (or two legends with just this)?
The legend is currently like:
ventilation – health
ventilation – productivity
ventilation – both
filtration – health
filtration – productivity
… and so on …
but I want it to be like:
ventilation
filtration
health
productivity
etc.
I want the legend to just have ventilation, filtration, source control, or combination, and health, productivity, or both (not all combinations of these parameters).
I tried using plt.close() before the loop in my code, but that didn’t fix the problem.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('merged_data.csv', index_col=False)
sns.set_theme(style="ticks")

x_values = []
y_values = []
errors = []
colors = []
markers = []

color_map = {
    'ventilation': 'blue',
    'filtration': 'green',
    'filtration ': 'green',
    'source control': 'orange',
    'combination': 'purple'
}
marker_map = {
    'health': 'o',        # Circle
    'productivity': 's',  # Square
    'both': '^'           # Triangle
}

for idx, row in df.iterrows():
    x_values.extend([row['citation']] * row['n'])
    y_values.extend([row['net']] * row['n'])
    colors.extend([color_map[row['type']]] * row['n'])
    markers.extend([marker_map[row['benefit']]] * row['n'])

plt.figure(figsize=(10, 6))
for type_, color in color_map.items():
    for benefit, marker in marker_map.items():
        mask = (df['type'] == type_) & (df['benefit'] == benefit)
        plt.scatter(df['citation'][mask].repeat(df['n'][mask]).values,
                    df['netnew'][mask].repeat(df['n'][mask]).values,
                    color=color, marker=marker, zorder=1, alpha=0.6, s=60,
                    edgecolor=color, linewidth=0.8,
                    label=f"{type_} - {benefit}")

plt.xticks(rotation=90)
plt.ylabel('Net Benefit ($/person/year)', fontsize=14)
plt.title('Health and Indirect Benefits of Indoor Air Quality Interventions', fontsize=14)
plt.legend(title='Type and Benefit', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.axhline(0, color='grey', linestyle='--', linewidth=1, label='y=0')
plt.tight_layout()
plt.show()
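One possible way to get the legend described above, sketched under assumptions rather than tested against merged_data.csv: build the legend from "proxy" handles instead of from the scatter calls, with one handle per colour and one per marker shape. The helper name build_legend_handles, the grey colour for the shape entries, the legend titles, and the second legend's anchor position are all illustrative choices, not part of the original code. The duplicate 'filtration ' key (trailing space) is left out here so that filtration appears only once.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Same maps as in the post, minus the duplicate 'filtration ' key (assumption).
color_map = {
    'ventilation': 'blue',
    'filtration': 'green',
    'source control': 'orange',
    'combination': 'purple',
}
marker_map = {
    'health': 'o',
    'productivity': 's',
    'both': '^',
}


def build_legend_handles(color_map, marker_map, neutral='grey'):
    """Hypothetical helper: one proxy handle per colour and one per marker."""
    # Colour entries: same circle marker everywhere, only the colour varies.
    color_handles = [
        Line2D([], [], linestyle='none', marker='o', color=color, label=name)
        for name, color in color_map.items()
    ]
    # Shape entries: neutral colour, only the marker varies.
    marker_handles = [
        Line2D([], [], linestyle='none', marker=marker, color=neutral, label=name)
        for name, marker in marker_map.items()
    ]
    return color_handles, marker_handles


fig, ax = plt.subplots(figsize=(10, 6))
# ... the ax.scatter(...) calls for each type/benefit pair would go here ...

color_handles, marker_handles = build_legend_handles(color_map, marker_map)

# First legend: intervention type, shown by colour.
type_legend = ax.legend(handles=color_handles, title='Intervention',
                        bbox_to_anchor=(1.05, 1), loc='upper left')
# A second ax.legend() call replaces the first, so re-add the first as an artist.
ax.add_artist(type_legend)

# Second legend: benefit type, shown by marker shape.
benefit_legend = ax.legend(handles=marker_handles, title='Benefit',
                           bbox_to_anchor=(1.05, 0.5), loc='upper left')
fig.tight_layout()
```

Because the handles are passed explicitly, the label= argument on each scatter call is ignored by these legends, so it can be removed. For a single legend instead of two, passing handles=color_handles + marker_handles to one ax.legend() call should give the seven entries in one box.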
submitted by /u/HoskinZo
Posted in r/learnpython