Coloring Scatter Plots by Column Values in Python
In R, ggplot2 can color data points according to the values in a column. The same result can be achieved in Python with a pandas DataFrame and Matplotlib.
Using Pandas and Matplotlib
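To map colors to values in Matplotlib, find the unique values in the category column, assign each one a number evenly spaced between 0 and 1, look up that number for every row, and pass the resulting values to the scatter call through its c argument. Matplotlib then converts the numbers to colors using its colormap.
Here's an example implementation: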
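<code class="python">import matplotlib.pyplot as plt
import numpy as np

def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
    fig, ax = plt.subplots()
    # Assign each unique category an evenly spaced value in [0, 1]
    categories = np.unique(df[catcol])
    colors = np.linspace(0, 1, len(categories))
    colordict = dict(zip(categories, colors))
    # Look up the value for every row without adding a column to the caller's dataframe
    point_colors = df[catcol].map(colordict)
    # Matplotlib turns these numbers into colors via its default colormap
    ax.scatter(df[xcol], df[ycol], c=point_colors)
    return fig</code>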
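The category-to-number mapping can also be delegated to pandas. The following is a minimal sketch, assuming the category column can be converted to the pandas categorical dtype; the helper names category_codes and scatter_by_codes and the choice of the "viridis" colormap are illustrative assumptions, not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd


def category_codes(series):
    """Return integer codes (0..n-1) and the category labels they refer to."""
    cat = series.astype("category")
    return cat.cat.codes.to_numpy(), list(cat.cat.categories)


def scatter_by_codes(df, xcol="Height", ycol="Weight", catcol="Gender"):
    """Hypothetical variant of dfScatter that colors points by category code."""
    codes, labels = category_codes(df[catcol])
    fig, ax = plt.subplots()
    # Integer codes are normalized and passed through the colormap by Matplotlib
    ax.scatter(df[xcol], df[ycol], c=codes, cmap="viridis")
    ax.set_xlabel(xcol)
    ax.set_ylabel(ycol)
    return fig, labels
```

The returned labels list records which category each code stands for, which is useful when a colorbar or legend is added later.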
Example Usage
Consider a dataframe with Height, Weight, and Gender columns. To create a scatter plot where colors are assigned based on the Gender column:
<code class="python">import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Height': np.random.normal(size=10),
    'Weight': np.random.normal(size=10),
    'Gender': ["Male", "Male", "Unknown", "Male", "Male",
               "Female", "Did not respond", "Unknown", "Female", "Female"],
})
fig = dfScatter(df)</code>
This will generate a scatter plot where the Gender column determines the color of each data point.
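A plot colored by category is easier to read with a legend. One way to get one, sketched here under the assumption that a separate scatter call per category is acceptable, is to group the dataframe and let Matplotlib's default color cycle pick a color for each group. The function name scatter_with_legend is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_with_legend(df, xcol="Height", ycol="Weight", catcol="Gender"):
    """Hypothetical helper: one scatter call per category, plus a legend."""
    fig, ax = plt.subplots()
    # groupby yields (category value, sub-dataframe) pairs in sorted key order
    for label, group in df.groupby(catcol):
        # Each call without an explicit color takes the next color in the cycle
        ax.scatter(group[xcol], group[ycol], label=label)
    ax.set_xlabel(xcol)
    ax.set_ylabel(ycol)
    ax.legend(title=catcol)
    return fig
```

Calling scatter_with_legend(df) on the dataframe above should give one legend entry per Gender value. Note that the default color cycle repeats once it runs out of colors, so this approach suits columns with a small number of categories.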