Devlog 14: Database Schema for Anonymous + Authenticated Users

7 min read
database architecture postgres

Devlog 14: Database Schema for Anonymous + Authenticated Users

One of the trickier design decisions was the database schema. I needed to support both anonymous and authenticated users, but they vote on the same matchups. How do I attribute votes?

I could have made two separate tables (anonymous_votes and user_votes), but that would make querying and aggregating a nightmare. Instead, I went with a single votes table that supports both.

Database schema for dual attribution - supporting both anonymous and authenticated users

The Core Problem

Every vote needs to be attributed to someone, but:

  • Anonymous users don't have user_ids
  • I can't require authentication (kills accessibility)
  • Need to prevent simple abuse
  • Must support seamless migration from anonymous to authenticated

The Solution: Dual Attribution

Here's the votes table schema:

CREATE TABLE votes (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  matchup_id UUID NOT NULL REFERENCES matchups(id),
  user_id UUID REFERENCES auth.users(id),
  device_id TEXT,
  side CHAR(1) NOT NULL CHECK (side IN ('a', 'b')),
  amount INTEGER NOT NULL DEFAULT 1,
  created_at TIMESTAMPTZ DEFAULT NOW(),

  -- Critical: one or the other must be set
  CHECK (
    (user_id IS NOT NULL AND device_id IS NULL) OR
    (user_id IS NULL AND device_id IS NOT NULL)
  )
);

The key is the CHECK constraint at the bottom. It enforces that every vote has EITHER a user_id OR a device_id, but never both and never neither.

Database schema structure - votes table with dual attribution fields and constraints

Why This Works

For anonymous users:

INSERT INTO votes (matchup_id, device_id, side, amount)
VALUES ('uuid-here', 'device-fingerprint', 'a', 10);

For authenticated users:

INSERT INTO votes (matchup_id, user_id, side, amount)
VALUES ('uuid-here', 'user-uuid', 'a', 10);

Same table, same structure, just different attribution fields.

Indexes for Performance

Votes need to be queried by matchup, by user, and by device:

CREATE INDEX idx_votes_matchup ON votes(matchup_id);
CREATE INDEX idx_votes_user ON votes(user_id) WHERE user_id IS NOT NULL;
CREATE INDEX idx_votes_device ON votes(device_id) WHERE device_id IS NOT NULL;
CREATE INDEX idx_votes_created ON votes(created_at);

The partial indexes (WHERE clauses) save space since user_id and device_id are mutually exclusive.

Aggregating Votes

To get total votes for a matchup:

SELECT
  side,
  SUM(amount) as total_votes,
  COUNT(DISTINCT COALESCE(user_id::text, device_id)) as unique_voters
FROM votes
WHERE matchup_id = $1
GROUP BY side;

This works whether votes come from user_ids or device_ids. The COALESCE handles both cases for counting unique voters.

Leaderboard Queries

For leaderboards, I need to rank voters:

SELECT
  COALESCE(user_id::text, device_id) as voter_id,
  COALESCE(
    (SELECT username FROM auth.users WHERE id = user_id),
    'Anonymous #' || LEFT(device_id, 8)
  ) as display_name,
  SUM(amount) as total_votes,
  side
FROM votes
WHERE matchup_id = $1
GROUP BY voter_id, user_id, device_id, side
ORDER BY total_votes DESC
LIMIT 10;

Anonymous users show as "Anonymous #abc12345", authenticated users show their username.

Migration on Signup

When an anonymous user creates an account, I migrate their votes:

UPDATE votes
SET user_id = $1, device_id = NULL
WHERE device_id = $2;

This reassigns all their anonymous votes to their new user account. Zero data loss.

Preventing Duplicate Attribution

The CHECK constraint prevents accidents like:

-- This will fail!
INSERT INTO votes (matchup_id, user_id, device_id, side, amount)
VALUES ('uuid', 'user-uuid', 'device-fp', 'a', 10);
-- Error: violates check constraint

Can't attribute the same vote to both a user and a device. Keeps data clean.

Alternative Approaches I Rejected

Option 1: Separate tables

CREATE TABLE anonymous_votes (...);
CREATE TABLE user_votes (...);

Problem: Every query needs UNION. Aggregation gets messy. Migration requires INSERT + DELETE.

Option 2: Always create user accounts

CREATE TABLE votes (
  user_id UUID NOT NULL  -- Always required
);

Problem: Forces signup. Kills the anonymous-first design.

Option 3: Polymorphic voter column

CREATE TABLE votes (
  voter_type TEXT,  -- 'user' or 'device'
  voter_id TEXT
);

Problem: Loses type safety. Can't use foreign keys. More error-prone.

The dual attribution with CHECK constraint is cleaner.

Gotchas I Hit

Issue 1: Forgotten WHERE clauses

Early on I wrote queries like:

SELECT COUNT(*) FROM votes WHERE user_id = $1;

This doesn't work for device_id votes. Had to change to:

SELECT COUNT(*) FROM votes
WHERE user_id = $1 OR device_id = $2;

Issue 2: JOIN complications

Joining with auth.users is tricky because device votes don't have user_ids:

-- Wrong - loses device votes
SELECT v.*, u.username
FROM votes v
JOIN auth.users u ON v.user_id = u.id;

-- Right - includes anonymous votes
SELECT v.*, u.username
FROM votes v
LEFT JOIN auth.users u ON v.user_id = u.id;

Always use LEFT JOIN when mixing attributed and anonymous data.

Performance Impact

The dual attribution adds minimal overhead:

  • CHECK constraint: evaluated at insert time only
  • Indexes: partial indexes keep them small
  • Queries: COALESCE is fast

In practice, vote queries take 5-20ms even with millions of rows.

Takeaway

Supporting both anonymous and authenticated users in one table requires careful schema design, but it's way cleaner than separate tables.

The CHECK constraint is the hero here. It enforces data integrity at the database level, so application bugs can't create invalid states.

PostgreSQL's partial indexes and COALESCE make querying both attribution types efficient and straightforward.

This pattern works great for any app with progressive authentication. Start anonymous, convert to authenticated, keep all data seamlessly.