"Renaming `user` when grep can't tell which model you mean"

dev.to

I stopped mid-task trying to add an editor field to the Article model.

The problem was the existing user field. Adding editor would leave Article with two foreign keys to User sitting side by side:

class Article(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    editor = models.ForeignKey(User, on_delete=models.CASCADE, related_name='edited_articles')

I could already see the code review comment. "What is user here? The author? The last person who touched it?"

The right move was to rename user to author first, then add editor. Leave it as-is and that ambiguity is baked into the codebase permanently. Every new developer who opens the model will have the same question. The field name is wrong and everyone who encounters it will eventually notice.

The problem was how much that rename scared me. user doesn't live only inside the Article model definition. It gets accessed as article.user in views, output as user_id in serializers, listed in admin's list_display, returned as a response key the frontend depends on. Every one of those places needs to change to author or author_id. The serializer's fields list, the frontend's expected key names — all of it.

Miss one and here's what happens: you renamed article.user to article.author in views.py but missed fields = ['user_id'] in the serializer. The serializer now names a field the model no longer has, so the endpoint errors out instead of returning author_id, and the screen that depends on it breaks. Or you run the migration and there's a reference still sitting somewhere, and AttributeError: 'Article' object has no attribute 'user' starts appearing in production logs. The kind of bug you find after the migration is already live. The logs fill up, Slack lights up, someone asks if we should roll back.

So the right order is: find every reference first, then rename. That means knowing every file and every line before touching the model definition or running any migration. The obvious tool is grep.

grep -rn "\.user" --include="*.py" ./

It returned 340 hits.

I ran colref on the same codebase. It returned 4 hits for Article --field user and 3 hits for Article --field user_id. Seven results total — every actual Article.user reference in the codebase.

340 became 7. The gap is what this article is about.

The same problem comes up with any field name that multiple models share. status, name, created_by, assigned_to — grep on a common field name returns results from every model that uses it, and there is no way to filter by which model you care about. The rename that looks straightforward from the model definition turns into a search problem the moment you try to verify what will break.

The Disambiguation Problem: Ten Models, One Field Name

grep returned 340 results because user is not specific to Article. Every model in the project that relates to a user has a user field. Here's where those 340 hits actually came from:

Model field                      Hits
Comment.user                       89
Order.user                         62
Profile.user                       58
Payment.user                       47
other models                       78
Article.user / Article.user_id      7

Seven out of 340. The other 333 are legitimate field accesses — not noise, not comments, not file extensions. Real code that correctly uses .user on those other models. Text search has no way to distinguish them from what you're looking for.

This is a different problem from the grep-noise issue that comes up when checking field deletions, where string literals, comments, and file paths inflate the count. That problem is covered in an earlier article in this series. Here the results are all real code. The issue is that user appears on ten models, and text search cannot tell which model any given obj.user belongs to.

You could try narrowing the search. Filter to files that import Article, or grep for article.user with a lowercase a expecting the variable name to match. Every refinement introduces a new failure mode. What if a view stores the article in a variable named obj or instance? What if it comes from get_object_or_404 and the variable name is different from what you searched for? A tighter grep pattern gives you more confidence in the hits it returns, but zero information about what it missed. You end up needing to verify your verification.
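
To make that failure mode concrete, here is a small sketch using Python's re module as a stand-in for the narrowed grep. The three sample lines are made up for illustration; assume all of them read the field on an Article instance, and only the variable name differs.

```python
import re

# Hypothetical snippets: assume each one reads the field on an Article instance,
# but only the first uses a variable literally named "article".
SAMPLE_LINES = [
    "name = article.user.username",
    "owner = obj.user",
    "if instance.user_id == request.user.id:",
]

# The "tighter" pattern: only match accesses on a variable called article.
NARROW = re.compile(r"\barticle\.user(_id)?\b")

def narrow_hits(lines):
    """Return the lines the narrowed pattern matches."""
    return [line for line in lines if NARROW.search(line)]

hits = narrow_hits(SAMPLE_LINES)
missed = [line for line in SAMPLE_LINES if line not in hits]
print(f"matched {len(hits)}, silently missed {len(missed)}")
```

The pattern is right about the one line it returns and says nothing about the other two, which is the whole problem.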

Checking all 340 results by hand isn't impossible but it's exactly the kind of task that gets postponed indefinitely. You open the first file, see that it's Comment.user, close it, open the next one — Order.user — and somewhere around result 30 you realize this is going to take the rest of the afternoon. You close the tab and tell yourself you'll do it tomorrow. Tomorrow you have other things. The rename doesn't happen and the ambiguity stays.

The field that should have been renamed six months ago is still called user. And the next time someone needs to add another user relationship to the model, the situation is even more tangled.

What you need is a search that understands which model you're asking about.

The Scope Filter: Telling colref Which Model You Mean

colref reads code as structure rather than text. It parses source files into an AST and targets attribute-access nodes — the nodes that represent obj.field in running code. String literals, comments, and template paths are invisible to it. Crucially, it accepts a --model flag.

Installation:

pipx install colref
# or
pip install colref

For a ForeignKey field, two access patterns appear in real code:

article.user      # loads the related User (an extra query unless it was already fetched, e.g. with select_related)
article.user_id   # reads the FK integer stored on the row (no extra query)

Performance-conscious code often uses article.user_id to skip that extra query when only the ID is needed. Both need to be found and updated, so run both:

# find obj.user accesses on Article instances
colref check --orm django --model Article --field user ./

# find obj.user_id accesses on Article instances
colref check --orm django --model Article --field user_id ./

Output:

# --field user
app/views.py:58
app/api.py:23
app/admin.py:11
app/tests/test_views.py:34

# --field user_id
app/serializers.py:18
app/api.py:29
app/tests/test_api.py:41

Seven results. Every one is an Article.user or Article.user_id access. Comment.user, Order.user, every other model's user field — excluded.

How does it tell the difference? colref reads the project's models.py files first and builds a map of which fields belong to which model. It confirms that Article has a field named user. Then it walks the AST looking for attribute-access nodes and applies the model context: comment.user is classified as a Comment instance access and filtered out. It's not searching for the string user — it's reading the syntax tree and reasoning about which model each access belongs to.
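
To give a feel for the idea, here is a rough sketch built on Python's standard ast module. It is not colref's implementation. The function name find_field_accesses and the "variable was assigned from Model.objects..." rule are my own simplifications, standing in for whatever model inference the real tool does.

```python
import ast

SOURCE = '''
article = Article.objects.get(pk=1)
comment = Comment.objects.get(pk=2)
print(article.user)        # Article access
print(comment.user)        # Comment access, should be skipped
label = "article.user"     # string literal, not an access
'''

def find_field_accesses(source, model, field):
    """Return line numbers where `field` is read on a variable bound to `model`.

    Binding is inferred only from `name = Model.objects...` assignments,
    a deliberately naive stand-in for real type inference.
    """
    tree = ast.parse(source)
    bound = {}  # variable name -> model name
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target = node.targets[0]
            root = node.value
            # Unwrap call/attribute chains down to the leftmost name.
            while isinstance(root, (ast.Call, ast.Attribute)):
                root = root.func if isinstance(root, ast.Call) else root.value
            if isinstance(target, ast.Name) and isinstance(root, ast.Name):
                bound[target.id] = root.id
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Attribute) and node.attr == field
                and isinstance(node.value, ast.Name)
                and bound.get(node.value.id) == model):
            hits.append(node.lineno)
    return sorted(hits)

print(find_field_accesses(SOURCE, "Article", "user"))
```

On the sample it reports only line 4, the article.user read. The comment.user access is attributed to Comment and dropped, and the string literal and the comments never become attribute nodes in the first place.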

Seven results is a number you can act on. Open each file, verify the context, update the reference. At a reasonable pace that's fifteen to twenty minutes of work. For each result, the question is simple: is this code still active in production? If the access is inside a function you know is live, update it. If it's inside commented-out code or a block that clearly never runs, note it and move on. Seven results means you can make that judgment for each one without losing track of where you are.

Once you've gone through colref's output and updated every location, there are two categories colref doesn't cover: serializer fields lists and admin class attributes. Neither is detected because determining which model a string like 'user_id' refers to inside a fields = [...] list requires tracing class inheritance, which colref doesn't handle yet. Check those directly before running the migration:

grep -rn "user_id" --include="*.py" ./app/serializers.py ./app/admin.py ./app/forms.py

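These files are limited in number. In most projects you have one serializers file, one admin file, maybe one forms file. Opening each and searching for user_id takes a few minutes. While you're there, look for the bare string 'user' as well: that is how the field appears in admin's list_display, and a search for user_id won't match it.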
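
If you'd rather not eyeball those files, the same check can be sketched with the ast module. This is an assumption-heavy helper of my own, not part of colref: the attribute names in WATCHED_ATTRS are illustrative, and it does not work out which model a class belongs to, so every hit still needs a human look.

```python
import ast

SOURCE = '''
class ArticleSerializer(serializers.ModelSerializer):
    class Meta:
        model = Article
        fields = ["id", "title", "user_id"]

class ArticleAdmin(admin.ModelAdmin):
    list_display = ("title", "user")
'''

WATCHED_ATTRS = {"fields", "list_display"}   # illustrative, not exhaustive
OLD_NAMES = {"user", "user_id"}

def find_string_field_refs(source):
    """Return (line, attribute, value) for old field names listed as strings."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        names = {t.id for t in node.targets if isinstance(t, ast.Name)}
        watched = names & WATCHED_ATTRS
        if not watched or not isinstance(node.value, (ast.List, ast.Tuple)):
            continue
        attr = sorted(watched)[0]
        for element in node.value.elts:
            # Only plain string entries can name a field.
            if isinstance(element, ast.Constant) and element.value in OLD_NAMES:
                found.append((element.lineno, attr, element.value))
    return sorted(found)

for line, attr, value in find_string_field_refs(SOURCE):
    print(f"line {line}: {attr} contains {value!r}")
```

It catches both the 'user_id' in the serializer and the 'user' in list_display, which a search for user_id alone would miss.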

The Trap: One Wrong Keystroke Drops Your Data

After updating every reference colref found and checking the serializers manually, the next step is generating the migration:

python manage.py makemigrations --name rename_article_user_to_author

Django detects that the old field is gone and a new one appeared, and asks:

Did you rename article.user to article.author (a ForeignKey)? [y/N]

The default is N. If you're not paying attention and hit Enter, or answer n because you're not sure what Django is asking, the resulting migration looks like this:

# What you get if you answer n — do not use this
operations = [
    migrations.RemoveField(
        model_name='article',
        name='user',
    ),
    migrations.AddField(
        model_name='article',
        name='author',
        field=models.ForeignKey(...),
    ),
]

RemoveField drops the user_id column. AddField creates a new empty author_id column. All the data that was in user_id, every article's author relationship, is gone. On a production database with existing rows, every Article.author is now null if the field allows null. If it doesn't, Django asks for a one-off default when generating the migration, and every existing article ends up pointing at that one user. Either way, not a situation you want to be in.

Answer y. That generates a RenameField migration:

# What you get if you answer y — this is what you want
operations = [
    migrations.RenameField(
        model_name='article',
        old_name='user',
        new_name='author',
    ),
]

RenameField renames the column in place. The data stays exactly where it is. Every row that had a user_id value now has that same value under author_id.

If you'd rather skip the interactive prompt and write the migration directly, use RenameField explicitly:

from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [
        ("app", "0042_previous_migration"),
    ]

    operations = [
        migrations.RenameField(
            model_name="article",
            old_name="user",
            new_name="author",
        ),
    ]

That's the safest approach: write the migration by hand, review it, apply it. No interactive prompt to accidentally misread.
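
The review step can be partly automated. Below is a minimal sketch of a check that reads a migration file as text and flags any model that has both a RemoveField and an AddField in it. The helper names are hypothetical, and a flagged migration is not necessarily wrong; it just deserves a second look before it runs.

```python
import ast

MIGRATION_SOURCE = '''
operations = [
    migrations.RemoveField(model_name="article", name="user"),
    migrations.AddField(model_name="article", name="author",
                        field=models.ForeignKey(...)),
]
'''

def _kwarg(call, name):
    """Return the constant value of keyword `name` on an ast.Call, or None."""
    for kw in call.keywords:
        if kw.arg == name and isinstance(kw.value, ast.Constant):
            return kw.value.value
    return None

def models_with_remove_and_add(source):
    """Return model names that have both RemoveField and AddField in `source`."""
    removed, added = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            model = _kwarg(node, "model_name")
            if node.func.attr == "RemoveField":
                removed.add(model)
            elif node.func.attr == "AddField":
                added.add(model)
    return sorted((removed & added) - {None})

suspicious = models_with_remove_and_add(MIGRATION_SOURCE)
if suspicious:
    print("RemoveField + AddField on:", ", ".join(suspicious))
```

Since it only parses the file and never imports Django, it can run before the migration is applied, for example from a pre-commit hook.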

One more check before applying: run the migration against your local database and confirm your test suite still passes. If a test factory is still setting user=... on Article instead of author=..., the tests will catch it here rather than in production. Factory and fixture files are another category colref doesn't cover — they often set fields using keyword arguments that look like function calls, not attribute accesses. A passing test suite after the migration is the confirmation that nothing was missed.
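
For a head start on that category, keyword arguments can be found the same way attribute accesses are. This is a sketch of my own, not a colref feature. It matches calls by the leftmost name in the call (ArticleFactory, Article.objects.create and so on), which is a naming-convention guess rather than real model resolution.

```python
import ast

SOURCE = '''
article = ArticleFactory(user=alice, title="Hello")
Article.objects.create(user=bob, title="World")
comment = CommentFactory(user=alice)
'''

def find_keyword_uses(source, callee_prefix, keyword):
    """Return line numbers of calls whose leftmost name starts with
    `callee_prefix` and that pass `keyword` as a keyword argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        root = node.func
        # Walk Article.objects.create down to the leftmost name, Article.
        while isinstance(root, ast.Attribute):
            root = root.value
        if (isinstance(root, ast.Name) and root.id.startswith(callee_prefix)
                and any(kw.arg == keyword for kw in node.keywords)):
            hits.append(node.lineno)
    return sorted(hits)

print(find_keyword_uses(SOURCE, "Article", "user"))
```

On the sample it reports lines 2 and 3 and leaves the CommentFactory call alone. The test suite remains the real safety net; this only shortens the list of things it will complain about.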

The API Surface: Renaming Propagates Past Python

Running python manage.py migrate isn't the end of the rename. The model field changed from user to author, which means the database column changed from user_id to author_id. Anything that reads data from the API and expected user_id in the response now receives author_id instead. That's a breaking change.

The places that need to change:

  • TypeScript type definitions that declare an Article interface with a userId field
  • API client code that reads response.data.user_id
  • OpenAPI schemas, and any client SDKs generated from them
  • Tests that check response payloads against user_id

A team that generates TypeScript types from the OpenAPI schema will have userId in dozens of components, not just one. A mobile app consuming the same API has its own model types. An analytics pipeline reading the JSON might extract user_id by name. None of these are visible from the Django codebase. The Python rename is only half the work — the other half is knowing what else consumed that response key and how quickly those consumers can be updated.

If the backend and frontend deploy together — same release, same CI run — update both in one change and deploy together. The rename on the Python side, the type definition update on the TypeScript side, deployed atomically. That's the cleanest path when it's available.

If backend and frontend deploy independently, there's a window between the backend deploy (which changes the response key from user_id to author_id) and the frontend deploy (which updates the code to read author_id). During that window the frontend is broken.

Django REST Framework's source parameter lets you keep the old response key temporarily:

class ArticleSerializer(serializers.ModelSerializer):
    # temporary bridge: API still returns user_id while the frontend catches up
    user_id = serializers.IntegerField(source='author_id', read_only=True)

    class Meta:
        model = Article
        fields = ['id', 'title', 'user_id', ...]

With this in place, the API continues returning user_id after the migration while the frontend deploys its update. Once the frontend is deployed and no longer reads user_id, remove the explicit field declaration and update fields to include author_id instead.
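
Stripped of the framework, the bridge amounts to choosing which key carries the same value. The plain-Python stand-in below is only an illustration of the payload in each phase; to_payload and legacy_key are made-up names, and the real work is done by the serializer shown above.

```python
from dataclasses import dataclass

@dataclass
class Article:
    id: int
    title: str
    author_id: int  # the renamed column

def to_payload(article, legacy_key=True):
    """Plain-Python stand-in for the serializer: pick the key that carries the FK."""
    key = "user_id" if legacy_key else "author_id"
    return {"id": article.id, "title": article.title, key: article.author_id}

article = Article(id=1, title="Hello", author_id=7)
print(to_payload(article))                    # bridge active: frontend still sees user_id
print(to_payload(article, legacy_key=False))  # bridge removed: response uses author_id
```

The value never changes between phases, only the key name, which is why the bridge is safe to leave in place for as long as the frontend needs.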

More steps, but no breaking window. Teams doing continuous deployment find this easier to manage than trying to coordinate a simultaneous Python-and-frontend release: Python rename + migration in one deploy, frontend update in the next, serializer bridge removed in a third cleanup.

The Naming Debt: Why user Became a Liability

Having made it through the rename, it's worth asking how the situation came up in the first place.

user describes an implementation fact — "this is a ForeignKey to User" — without saying anything about what that relationship means. Is it the author? The assignee? The last editor? The name doesn't tell you. That gap becomes a problem the moment you need a second user relationship on the same model, and it becomes a grep problem the moment you need to check where the field is used.

user is also about the most common ForeignKey name you'll find in Django codebases. Comment, Order, Payment, Profile: every model that touches users tends to call the field user. That's why grep returned 333 irrelevant results. The name isn't specific to any model or any relationship. It's the field name equivalent of naming a variable data.

What would have been different with author from the start?

grep -rn "\.author" --include="*.py" ./

Far fewer models have an author ForeignKey. Results would stay in the dozens and remain checkable by hand. You might not have needed colref at all.

The principle for naming ForeignKey fields: use the name of the relationship, not the name of the target model. Instead of user, use author, assignee, reviewer, approver — whatever the relationship actually means in your domain. The field name should describe why the relationship exists, not just where it points.

The test is simple: can a developer reading the model definition understand the field's purpose without looking at any other file? author passes that test. user doesn't. A field named user on an Article model tells you the type of the related object but nothing about why the relationship exists. That gap grows more costly every time a new developer joins the team or a new relationship gets added to the model.

That's straightforward to apply on a new project, when no one has committed to a name yet. On an inherited codebase, a product that's been running for years, "it should have been author from the start" is not actionable. user is already in views, serializers, tests, and migrations. The rename has to happen, and the question is only how to do it without breaking anything in production.

That's where colref is useful. Seven results instead of 340. A fifteen-minute update instead of an indefinitely postponed task.

The rename that's been sitting on the backlog is usually not blocked by difficulty. It's blocked by the upfront cost of a check that feels too expensive to run. Reducing that cost from hours to minutes is what makes the work actually happen.

colref supports Django and Rails. For the full list of what it detects and the current roadmap, see github.com/shinagawa-web/colref.


Appendix: The rename procedure

Step 1: Find every reference.

colref check --orm django --model Article --field user ./
colref check --orm django --model Article --field user_id ./

Update each location colref returns. Then check serializers, admin, and forms for string references colref does not detect:

grep -rn "user_id" --include="*.py" ./app/serializers.py ./app/admin.py ./app/forms.py

For the full list of what colref detects and does not, see the Detection Patterns docs.

Step 2: Rename the field in the model.

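class Article(models.Model):
    # Change only the name here. Altering other arguments in the same step
    # (such as adding related_name) can stop Django from recognizing the rename.
    author = models.ForeignKey(User, on_delete=models.CASCADE)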

Step 3: Generate the migration. Answer y.

python manage.py makemigrations --name rename_article_user_to_author

When Django asks "Did you rename article.user to article.author?", answer y. n generates RemoveField + AddField and drops your data. y generates RenameField and keeps it.

Step 4: Run your tests locally, then apply.

python manage.py migrate

Step 5: Handle the API surface.

Update TypeScript types, API clients, and OpenAPI schemas. If the frontend cannot deploy at the same time, add user_id = serializers.IntegerField(source='author_id', read_only=True) to the serializer as a temporary bridge, then remove it once the frontend has deployed.

