ENH: Add `force_suffixes` boolean argument to `pd.merge` #61498

kopytjuk · 2025-05-26T17:59:35Z

Motivation

Often, when working with wide dataframes (i.e. many columns) in exploratory analysis, merging them leads to an even wider dataframe. Currently, the suffixes mechanism is applied only to columns that have the same name in both dataframes.

To work around this, developers often rename the columns beforehand, or use solutions similar to the one suggested here.
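As a small illustration of the current behaviour described above (the example frames and the `Score` and `Rank` columns are made up for this sketch), only the overlapping non-key column receives a suffix:

```python
import pandas as pd

# Illustrative frames: "Value" exists in both, "Score" and "Rank" in one each.
left = pd.DataFrame({"ID": [1, 2, 3], "Value": ["A", "B", "C"], "Score": [0.1, 0.2, 0.3]})
right = pd.DataFrame({"ID": [2, 3, 4], "Value": ["D", "E", "F"], "Rank": [7, 8, 9]})

merged = pd.merge(left, right, on="ID", how="inner", suffixes=("_left", "_right"))

# Only the clashing "Value" column is suffixed; "ID", "Score" and "Rank" keep their names.
print(list(merged.columns))
# ['ID', 'Value_left', 'Score', 'Value_right', 'Rank']
```

After such a merge it is no longer visible from the name alone which frame `Score` or `Rank` came from, which is the situation this PR targets.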

Changes

This PR adds a `force_suffixes` boolean argument to `pd.merge` that applies the suffixes to all columns, whether or not they are equally named.

The goal is to have the following:

import pandas as pd

df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Value': ['A', 'B', 'C']
})

df2 = pd.DataFrame({
    'ID': [2, 3, 4],
    'Value': ['D', 'E', 'F']
})

# force_suffixes is the argument proposed in this PR
merged_df = pd.merge(df1, df2, on='ID', how="inner", suffixes=('_left', '_right'), force_suffixes=True)

# Goal:
expected = pd.DataFrame([[2, "B", 2, "D"], [3, "C", 3, "E"]],
                        columns=["ID_left", "Value_left", "ID_right", "Value_right"])
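Until such a flag exists, the goal can be approximated with the current API. The following is only a rough sketch: `merge_with_forced_suffixes` is a hypothetical helper written for this illustration, not part of pandas and not the implementation in this PR. It assumes `on` names columns present in both frames.

```python
import pandas as pd


def merge_with_forced_suffixes(left, right, on, how="inner",
                               suffixes=("_left", "_right"), force_suffixes=False):
    """Hypothetical helper (not a pandas API) emulating the proposed flag."""
    if not force_suffixes:
        # Default path: plain pd.merge, so existing behaviour is unchanged.
        return pd.merge(left, right, on=on, how=how, suffixes=suffixes)

    keys = [on] if isinstance(on, str) else list(on)
    lsuffix, rsuffix = suffixes
    # Suffix every column on both sides, then join on the renamed keys.
    return pd.merge(
        left.add_suffix(lsuffix),
        right.add_suffix(rsuffix),
        left_on=[f"{k}{lsuffix}" for k in keys],
        right_on=[f"{k}{rsuffix}" for k in keys],
        how=how,
    )


df1 = pd.DataFrame({"ID": [1, 2, 3], "Value": ["A", "B", "C"]})
df2 = pd.DataFrame({"ID": [2, 3, 4], "Value": ["D", "E", "F"]})

result = merge_with_forced_suffixes(df1, df2, on="ID", force_suffixes=True)
print(list(result.columns))
# ['ID_left', 'Value_left', 'ID_right', 'Value_right']
```

Keeping `force_suffixes=False` as the default means callers who do not opt in get exactly what `pd.merge` returns today.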

addresses Please add force_suffixes to pandas.merge() #17834
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

kopytjuk · 2025-05-26T18:03:54Z

Hey @mroeschke, can you please take a look at my PR and tell me if the direction is right for you (i.e. you are OK with an additional argument) before I fix the failing tests and linting errors and adjust the documentation? Ty in advance!

mroeschke

Thanks for opening the PR, but I would say this feature needs more discussion and agreement from the core team before moving forward with a PR.

datapythonista · 2025-06-02T18:24:06Z

merge already has quite a complex signature, and what you are trying to solve here can be easily done in pandas with:

pd.merge(df1.add_suffix("_left"), df2.add_suffix("_right"))

Let me know if I'm missing something, but while it seems some people would appreciate that, it doesn't seem any core dev is excited about this, and I understand why, since it makes a tricky method even trickier.

@TomAugspurger, in the issue it seems you were a bit more positive than others about adding this when it was discussed some time ago. Would you move forward with this PR? Otherwise let's close it.

kopytjuk · 2025-06-02T19:14:23Z

> merge already has quite a complex signature, and what you are trying to solve here can be easily done in pandas with:
> pd.merge(df1.add_suffix("_left"), df2.add_suffix("_right"))
> Let me know if I'm missing something, but while it seems some people would appreciate that, it doesn't seem any core dev is excited about this, and I understand why, since it makes a tricky method even trickier.

Thanks for your feedback!

Let me motivate the additional flag approach. I agree that using add_suffix is a valid approach. However, it adds complexity to the users' code, forcing them to pass additional arguments like left_on="uuid_left", right_on="id_right", which makes it even more complicated.
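A minimal sketch of that point, using made-up `users` and `orders` frames whose key columns (`uuid`, `id`) mirror the argument names above: once both sides are suffixed, no column name is shared any more, so the keys have to be spelled out with their suffixed names.

```python
import pandas as pd

# Illustrative frames; the column names are assumptions for this example.
users = pd.DataFrame({"uuid": [1, 2, 3], "name": ["a", "b", "c"]})
orders = pd.DataFrame({"id": [2, 3, 4], "total": [10, 20, 30]})

left = users.add_suffix("_left")
right = orders.add_suffix("_right")

# Without explicit keys pandas finds no common column to join on.
try:
    pd.merge(left, right)
except pd.errors.MergeError as exc:
    print(f"MergeError: {exc}")

# The caller has to track the suffixed key names on both sides.
merged = pd.merge(left, right, left_on="uuid_left", right_on="id_right")
print(list(merged.columns))
# ['uuid_left', 'name_left', 'id_right', 'total_right']
```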

Using force_suffixes would make joins of wide dataframes in exploratory settings very easy, because people cannot remember 10 different column names for each of the participating dataframes.

However, I also agree about the additional complexity the implementation would bring. The internal logic for renaming and returning columns is already quite complex, and it is not easy to grasp, maintain, and test.

I will await your decision.

datapythonista · 2025-06-02T19:20:08Z

Thanks for the clarification. I didn't realize the suffix would also be added to the join columns, which makes things more complex than just calling add_suffix, an approach that otherwise feels quite clean. There is clearly a trade-off here. I'm personally fine with the change even if it adds some complexity, since there seem to be many users who would appreciate it.

@pandas-dev/pandas-core any opinion on adding a flag to pandas.merge to make the suffixes be added also to non-conflicting names?

Dr-Irv · 2025-06-03T19:48:03Z

> @pandas-dev/pandas-core any opinion on adding a flag to pandas.merge to make the suffixes be added also to non-conflicting names?

I think this is a nice idea. If we default it to False, then the current behavior is preserved.

rhshadrach · 2025-06-04T01:03:35Z

This is a situation I've run into occasionally. It's a few lines of user code, and yes, you need to track what you're joining on. I don't think it's unreasonable for the onus to be on users here, but no objection to adding a flag.

Add "force_suffixes" flag to pd.merge

c49ed5e

kopytjuk changed the title ~~Add force_suffixes boolean argument to pd.merge~~ ENH: Add force_suffixes boolean argument to pd.merge May 26, 2025

kopytjuk added 2 commits May 26, 2025 20:24

Add a simple test for force-suffixes

8fa3cdb

Small refactor in merge tests

d1c651d

mroeschke requested changes May 27, 2025
