Legend has one entry for every combination of two parameters -- how can I simplify the legend? /u/HoskinZo Python Education

I have a scatterplot (matplotlib, plotted from a pandas DataFrame) with two parameters: type of intervention (ventilation, filtration, source control, or combination) and type of benefit (health, productivity, or both). Right now the legend creates an entry for every combination of these parameters (so 15 entries). How can I simplify my legend so it just shows the dot colour for the intervention type and the dot shape for the benefit type (or two legends with just this)?

The legend is currently like:

ventilation – health

ventilation – productivity

ventilation – both

filtration – health

filtration – productivity

… and so on …

but I want it to be like:

ventilation

filtration

health

productivity

etc.

I want the legend to just have ventilation, filtration, source control, or combination, and health, productivity, or both (not all combinations of these parameters).

I tried using plt.close() before the loop in my code, but that didn’t fix the problem.

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    df = pd.read_csv('merged_data.csv', index_col=False)

    sns.set_theme(style="ticks")

    x_values = []
    y_values = []
    errors = []
    colors = []
    markers = []

    color_map = {
        'ventilation': 'blue',
        'filtration': 'green',
        'filtration ': 'green',
        'source control': 'orange',
        'combination': 'purple'
    }

    marker_map = {
        'health': 'o',        # Circle
        'productivity': 's',  # Square
        'both': '^'           # Triangle
    }

    for idx, row in df.iterrows():
        x_values.extend([row['citation']] * row['n'])
        y_values.extend([row['net']] * row['n'])
        colors.extend([color_map[row['type']]] * row['n'])
        markers.extend([marker_map[row['benefit']]] * row['n'])

    plt.figure(figsize=(10, 6))

    for type_, color in color_map.items():
        for benefit, marker in marker_map.items():
            mask = (df['type'] == type_) & (df['benefit'] == benefit)
            plt.scatter(df['citation'][mask].repeat(df['n'][mask]).values,
                        df['netnew'][mask].repeat(df['n'][mask]).values,
                        color=color, marker=marker, zorder=1, alpha=0.6, s=60,
                        edgecolor=color, linewidth=0.8,
                        label=f"{type_} - {benefit}")

    plt.xticks(rotation=90)
    plt.ylabel('Net Benefit ($/person/year)', fontsize=14)
    plt.title('Health and Indirect Benefits of Indoor Air Quality Interventions', fontsize=14)
    plt.legend(title='Type and Benefit', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.axhline(0, color='grey', linestyle='--', linewidth=1, label='y=0')
    plt.tight_layout()
    plt.show()
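
For reference, here is a minimal sketch of one way to get the kind of legend described above. It assumes matplotlib "proxy artists": the scatter calls get no label, and the legends are built from empty Line2D handles. The small DataFrame is a made-up stand-in for merged_data.csv (only the column names come from the code above). The legend titles "Intervention" and "Benefit" are placeholders. Stripping whitespace from the type column is an assumption about why the 'filtration ' key exists.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Hypothetical stand-in for merged_data.csv; the values are made up for illustration.
df = pd.DataFrame({
    "citation": ["A 2019", "B 2020", "C 2021", "D 2022"],
    "type": ["ventilation", "filtration ", "source control", "combination"],
    "benefit": ["health", "productivity", "both", "health"],
    "netnew": [120.0, -35.0, 60.0, 210.0],
    "n": [2, 1, 3, 1],
})

# Assumption: 'filtration ' is just 'filtration' with a stray space, so strip it
# and keep a single key per intervention type.
df["type"] = df["type"].str.strip()

color_map = {"ventilation": "blue", "filtration": "green",
             "source control": "orange", "combination": "purple"}
marker_map = {"health": "o", "productivity": "s", "both": "^"}

fig, ax = plt.subplots(figsize=(10, 6))

# Plot each type/benefit combination, but pass no label= so that the scatter
# calls themselves contribute nothing to the legend.
for type_, color in color_map.items():
    for benefit, marker in marker_map.items():
        sub = df[(df["type"] == type_) & (df["benefit"] == benefit)]
        if sub.empty:
            continue
        repeats = sub["n"].to_numpy()
        ax.scatter(sub["citation"].to_numpy().repeat(repeats),
                   sub["netnew"].to_numpy().repeat(repeats),
                   color=color, marker=marker, zorder=1, alpha=0.6, s=60,
                   edgecolor=color, linewidth=0.8)

# Proxy handles: one coloured dot per intervention type ...
type_handles = [
    Line2D([], [], linestyle="none", marker="o", color=color, label=type_)
    for type_, color in color_map.items()
]
# ... and one grey marker per benefit type.
benefit_handles = [
    Line2D([], [], linestyle="none", marker=marker, color="grey", label=benefit)
    for benefit, marker in marker_map.items()
]

# First legend: colours. add_artist keeps it on the axes when the second
# call to ax.legend() creates a new legend.
type_legend = ax.legend(handles=type_handles, title="Intervention",
                        bbox_to_anchor=(1.05, 1), loc="upper left")
ax.add_artist(type_legend)

# Second legend: marker shapes, placed lower on the same side.
benefit_legend = ax.legend(handles=benefit_handles, title="Benefit",
                           bbox_to_anchor=(1.05, 0.5), loc="upper left")

ax.axhline(0, color="grey", linestyle="--", linewidth=1)
ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("Net Benefit ($/person/year)", fontsize=14)
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. For a single combined legend instead of two, passing handles=type_handles + benefit_handles to one ax.legend() call should work the same way, with no need for add_artist.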

submitted by /u/HoskinZo to r/learnpython