Plotting a Scatter Plot Grouped by Category in Python
Problem:
In pandas, how can you create a scatter plot in which the markers are grouped into categories defined by a third column?
Solution:
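To create a scatter plot grouped by category, group the DataFrame on the third column and call plot once per group instead of making a single scatter call. Each plot call draws one category as its own series, so it takes the next color from the color cycle and gets its own legend entry. With a marker set and linestyle='', plot draws only the markers, so the result looks like a scatter plot.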
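For contrast, here is a minimal sketch of what a single scatter call needs in order to color points by category. The data, the "viridis" colormap, and the marker size are assumptions chosen for illustration; the point is only that the category strings have to be converted to numeric codes and the legend has to be assembled by hand.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data (assumed): 20 random points with labels a, b or c.
rng = np.random.default_rng(1974)
num = 20
df = pd.DataFrame({
    "x": rng.random(num),
    "y": rng.random(num),
    "label": rng.choice(["a", "b", "c"], num),
})

# scatter cannot color by strings directly, so map each category
# to an integer code (codes follow the sorted order of the labels).
codes, uniques = pd.factorize(df["label"], sort=True)

fig, ax = plt.subplots()
points = ax.scatter(df["x"], df["y"], c=codes, cmap="viridis", s=80)

# Build the legend manually: one handle per integer code,
# relabeled with the original category names.
handles, _ = points.legend_elements()
ax.legend(handles, list(uniques))
```

Call plt.show() afterwards to display the figure. The per-group plot approach avoids both the manual encoding and the hand-built legend.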
Here's a complete example using plot:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1974)

# Generate Data
num = 20
x, y = np.random.random((2, num))
labels = np.random.choice(['a', 'b', 'c'], num)
df = pd.DataFrame(dict(x=x, y=y, label=labels))

groups = df.groupby('label')

# Plot
fig, ax = plt.subplots()
ax.margins(0.05)
for name, group in groups:
    ax.plot(group.x, group.y, marker='o', linestyle='', ms=12, label=name)
ax.legend()

plt.show()
This will produce a scatter plot with markers categorized by the values in the 'label' column and a legend that identifies the categories.
Additionally, you can customize the appearance of the plot by changing the padding passed to ax.margins(), setting the marker size (ms), and specifying a color for each category's markers (color).
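The customizations mentioned above could look roughly like this. It is a sketch, not the only way to do it: the palette dictionary and the specific margin and marker-size values are hypothetical choices made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1974)

# Same kind of data as in the main example.
num = 20
x, y = np.random.random((2, num))
labels = np.random.choice(["a", "b", "c"], num)
df = pd.DataFrame(dict(x=x, y=y, label=labels))

# Hypothetical palette: one named color per category.
palette = {"a": "tab:blue", "b": "tab:orange", "c": "tab:green"}

fig, ax = plt.subplots()
ax.margins(0.1)  # 10% padding around the data on both axes
for name, group in df.groupby("label"):
    ax.plot(
        group["x"], group["y"],
        marker="o", linestyle="",  # markers only, no connecting line
        ms=8,                      # smaller markers than the main example
        color=palette[name],       # fixed color instead of the color cycle
        label=name,
    )
ax.legend(title="label")
```

Call plt.show() afterwards to display the figure. Using a dictionary keyed by category keeps each category's color stable even if a category is missing from a particular data set.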