Why fonts might be biasing your WPM
If your typing test quietly swaps the prompt’s font, your score can shift even if your motor skill doesn’t. Reading ease changes eye movements, error perception, and pacing—so your WPM and accuracy can rise or fall for reasons that have nothing to do with your fingers. That’s a test design problem we can fix with controlled experiments.
Below is a practical plan to A/B test three typography levers—typeface class (serif/sans/mono), programming ligatures (on/off), and line length—on both prose and code passages, plus what current research suggests you’ll see.
What the research says (in brief)
- Monospace vs proportional. A classic legibility study found proportional spacing yielded faster reading than monospaced (typewriter‑style) text, with no comprehension loss—evidence that fixed widths can slow reading when content is prose. (journals.uc.edu)
- Eye movements differ by spacing. Eye‑tracking work that directly compared a monospaced font (Courier) to a proportional one (Arial) reported more fixations per word in the monospaced condition—consistent with slightly harder visual parsing, even when overall speed is similar. (sciencedirect.com)
- Line length matters. In one influential on‑screen reading experiment, wider lines (about 80 characters per line) were read roughly 25–30% faster than narrow lines (about 40 characters), with comprehension steady. Other studies show mixed outcomes: one online news study found 95 characters per line was read fastest, while user preferences clustered at the extremes. The practical takeaway is to test within a reasonable window rather than assume a single magic number. (journals.sagepub.com)
- There are exceptions. For readers with certain low‑vision conditions (e.g., AMD), some studies show monospaced designs like Courier can support better acuity than common proportional faces like Arial at chart sizes—another reason to measure, not guess. (sciencedirect.com)
- Ligatures are visual, not textual. Programming ligatures are OpenType glyph substitutions driven by the GSUB table; they don’t change the underlying characters in your code or the prompt. Fira Code’s documentation and the OpenType spec both make this explicit. (firacode.com)
Designing a clean experiment on your typing site
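Run separate experiments for prose and code. Readers process continuous prose differently than code, and monospaced faces have strong affordances in editors. Within each track, vary the levers below independently of one another (a crossed design) so you can estimate main effects and interactions; if two levers always change together, their effects cannot be separated.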
1) Typeface class (serif, sans, mono)
- Between‑subjects or crossover? If your traffic is high, use a randomized between‑subjects A/B/C to avoid learning effects. If traffic is modest, use a within‑subjects crossover (each user sees multiple conditions) but randomize order and insert warm‑up lines to reduce carryover.
- Representative families: for sans, pick a screen‑tuned face; for serif, a neutral text serif; for mono, a developer font such as Cascadia Code or JetBrains Mono. These were designed for code environments and support ligatures in many editors. (learn.microsoft.com)
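Both assignment schemes above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the function names, the salt strings, and the choice of SHA-256 hashing are all assumptions made for the example. Hashing a stable identifier (a user id, or a session id if you randomize per session) keeps assignment deterministic without storing state.

```python
import hashlib
from itertools import permutations

TYPEFACE_ARMS = ("serif", "sans", "mono")


def _bucket(subject_id: str, salt: str, n: int) -> int:
    """Map an identifier to a stable bucket in range(n) using SHA-256."""
    digest = hashlib.sha256(f"{salt}:{subject_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def between_subjects_arm(subject_id: str, salt: str = "typeface-v1") -> str:
    """Between-subjects: each subject sees exactly one typeface class."""
    return TYPEFACE_ARMS[_bucket(subject_id, salt, len(TYPEFACE_ARMS))]


def crossover_order(subject_id: str, salt: str = "typeface-order-v1") -> tuple[str, ...]:
    """Crossover: each subject sees every class, in one of the 6 possible orders."""
    orders = list(permutations(TYPEFACE_ARMS))
    return orders[_bucket(subject_id, salt, len(orders))]
```

Changing the salt reshuffles everyone, which is useful when you start a new experiment and do not want it correlated with the previous one.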
2) Programming ligatures (on vs off)
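- Where to test: Only in the code track. Toggle ligatures via CSS `font-variant-ligatures` or `font-feature-settings` for the monospaced family. Many code fonts, Fira Code among them, implement these ligatures as contextual alternates (`calt`), so use `font-variant-ligatures: none` rather than `no-common-ligatures`, and confirm in a browser that the off arm really renders separate glyphs. Record whether users perceive ligatures as clearer or more error-prone when scanning operators.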
- Why it matters: Ligatures visually compress sequences like `!==` into a single glyph. They do not alter the stored text, but the visual change could affect error detection (perceived difficulty) even if accuracy is unchanged—an ideal hypothesis to test. (learn.microsoft.com)
3) Line length (measure)
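- Test at least three max-widths: 45ch, 60ch, 75ch. The `ch` unit equals the advance width of the zero glyph in the current font, so `max-width: 60ch` yields exactly 60 characters per line in a monospaced face but only an approximate count in a proportional one, where typical letters are often narrower than `0`. If you compare measures across typeface classes, log the actual characters per rendered line as well. Research suggests performance may rise on longer lines, while many users still prefer moderate measures, so collect both speed and preference. (developer.mozilla.org)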
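Taken together, the three levers amount to one small record per experimental arm. The sketch below, assuming a server that renders the prompt's CSS, turns such a record into a rule. The `.prompt` selector, the `Condition` class, and the font stacks are placeholders invented for the example; substitute the families you actually ship.

```python
from dataclasses import dataclass

# Placeholder font stacks (assumption); replace with the families you serve.
FONT_STACKS = {
    "serif": "Georgia, serif",
    "sans": "system-ui, sans-serif",
    "mono": '"Cascadia Code", "JetBrains Mono", monospace',
}


@dataclass(frozen=True)
class Condition:
    typeface: str    # "serif", "sans", or "mono"
    ligatures: bool  # only meaningful in the code track
    measure_ch: int  # e.g. 45, 60, or 75


def prompt_css(cond: Condition, selector: str = ".prompt") -> str:
    """Render one CSS rule for the prompt container of a given arm."""
    if cond.typeface not in FONT_STACKS:
        raise ValueError(f"unknown typeface class: {cond.typeface}")
    declarations = [
        f"font-family: {FONT_STACKS[cond.typeface]}",
        # "none" also switches off contextual alternates, not just common ligatures.
        f"font-variant-ligatures: {'normal' if cond.ligatures else 'none'}",
        f"max-width: {cond.measure_ch}ch",
    ]
    body = "\n".join(f"  {d};" for d in declarations)
    return f"{selector} {{\n{body}\n}}"
```

Calling `prompt_css(Condition("mono", False, 60))` produces a rule with the mono stack, `font-variant-ligatures: none`, and `max-width: 60ch`. Logging the `Condition` alongside each result keeps the analysis tied to what the user actually saw.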
Passages, devices, and controls
- Content: Curate difficulty‑balanced prose snippets (neutral topics, 120–180 words) and code snippets (10–20 lines, mixed symbols). Keep lexical difficulty and operator density consistent across variants.
- Devices: Stratify results by viewport class (mobile, tablet, desktop). Reading behavior—and thus typing—changes with screen size.
- Font size and line height: Lock to a sensible baseline (e.g., 16–18 px equivalent and 1.4–1.6 line height) across conditions.
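One rough way to check that code snippets carry a similar operator load is to count operator characters as a share of all non-whitespace characters. The regular expression below is an illustrative character set, not a real lexer (it will also count punctuation inside strings and comments), and the 0.03 tolerance is an arbitrary assumption to tune against your own snippet pool.

```python
import re

# Runs of characters commonly used in operators; illustrative, not a full lexer.
OPERATOR_RE = re.compile(r"[-+*/%=!<>&|^~?:]+")


def operator_density(snippet: str) -> float:
    """Operator characters as a fraction of non-whitespace characters."""
    non_space = sum(1 for ch in snippet if not ch.isspace())
    if non_space == 0:
        return 0.0
    operator_chars = sum(len(m.group()) for m in OPERATOR_RE.finditer(snippet))
    return operator_chars / non_space


def densities_balanced(snippets: list[str], tolerance: float = 0.03) -> bool:
    """True if the spread of operator density across snippets is within tolerance."""
    if not snippets:
        return True
    densities = [operator_density(s) for s in snippets]
    return max(densities) - min(densities) <= tolerance
```

A check like this only screens candidates; a human pass over the snippets is still needed to judge difficulty.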
Metrics to log
- Speed: net WPM, counting only correctly transcribed characters (by convention, five characters count as one word).
- Accuracy: character error rate (CER), backspace rate, and error types (substitutions vs omissions). For code, flag operator‑related errors separately (e.g., confusion between `!=` and `!==`).
- Perceived difficulty: a one‑question 5‑point rating after each test: How visually easy was this prompt to read?
- Eye-movement proxies: If you can’t run eye tracking, approximate dwell near operator clusters or line breaks with inter-keystroke pauses and backspaces at those positions. This is useful when judging ligature effects.
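The speed, accuracy, and pause metrics above might be computed along these lines. Treat it as a sketch under stated assumptions: net WPM here compares characters position by position (so one dropped character shifts everything after it; an aligned comparison would be more forgiving), CER uses plain Levenshtein distance, and the 1000 ms pause threshold is an arbitrary starting point. All function names are made up for the example.

```python
def edit_distance(target: str, typed: str) -> int:
    """Levenshtein distance between the prompt and what was typed."""
    prev = list(range(len(typed) + 1))
    for i, t_ch in enumerate(target, start=1):
        curr = [i]
        for j, y_ch in enumerate(typed, start=1):
            cost = 0 if t_ch == y_ch else 1
            curr.append(min(
                prev[j] + 1,         # prompt character omitted by the typist
                curr[j - 1] + 1,     # extra character typed
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(target: str, typed: str) -> float:
    """Edit distance normalized by prompt length."""
    if not target:
        raise ValueError("target must be non-empty")
    return edit_distance(target, typed) / len(target)


def net_wpm(target: str, typed: str, seconds: float) -> float:
    """Correct characters / 5, per minute (position-wise comparison)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    correct = sum(1 for a, b in zip(target, typed) if a == b)
    return (correct / 5) / (seconds / 60)


def long_pauses(key_times_ms: list[float], threshold_ms: float = 1000.0) -> list[int]:
    """Indices of keystrokes preceded by a gap longer than threshold_ms."""
    return [
        i for i in range(1, len(key_times_ms))
        if key_times_ms[i] - key_times_ms[i - 1] > threshold_ms
    ]
```

Mapping the pause indices back to prompt positions lets you ask whether pauses cluster at operators or line breaks in one arm more than another.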
What you might find (hypotheses to validate)
- Prose, monospace vs proportional: Expect proportional faces to yield slightly higher WPM and/or fewer fixations (indirectly captured by backspaces/pauses), mirroring research that shows monospaced text induces more fixations per word. Accuracy should be similar if the prompt is clean. (sciencedirect.com)
- Code, ligatures on vs off: Expect small or null changes in raw speed, but shifts in perceived difficulty—some users report operators are easier to parse; others feel ambiguity around multi‑char tokens. That’s precisely why a toggle and a survey help separate preference from performance. The underlying text remains identical either way. (learn.microsoft.com)
- Line length: On desktop, the longer measures (60–75ch) may boost raw speed even where users rate a more moderate line as more comfortable, so speed and preference can point to different settings within the commonly cited 50–75 characters-per-line range. On mobile, aim shorter. Validate with your data and segment by device. (journals.sagepub.com)
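Segmenting by device can be as simple as grouping logged results before summarizing them. The sketch below assumes each result is a dict with `device`, `arm`, and `wpm` keys; those field names are invented for the example and should be adapted to your logging schema. The median is used because WPM distributions tend to have outliers.

```python
from collections import defaultdict
from statistics import median


def median_wpm_by_segment(rows):
    """Median WPM for each (device, arm) pair.

    rows: iterable of dicts with 'device', 'arm', and 'wpm' keys (assumed schema).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["device"], row["arm"])].append(row["wpm"])
    return {key: median(values) for key, values in groups.items()}
```

Comparing arms within a device class, rather than pooling everything, avoids mistaking a shift in device mix for a typography effect.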
Practical setup tips (actionable)
- Use `max-width: 60ch` on the prompt container to pin a mid‑range measure; test 45ch and 75ch variants. The `ch` unit is defined by the width of the zero glyph in the active font. (developer.mozilla.org)
- Keep copy the same across variants and randomize assignment at the session level to avoid users cherry‑picking easy prompts.
- Run an A/A test first to verify your instrumentation; your WPM distributions should overlap.
- Pre‑register metrics and analysis (speed, CER, perceived difficulty) to prevent fishing for effects once results arrive.
- For accessibility, always provide a user override to pick a preferred font and toggle ligatures, even if your default is data-driven. Research findings and individual vision vary; for example, readers with AMD have shown better acuity on charts set in a monospaced face such as Courier than on charts set in Arial. (sciencedirect.com)
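For the A/A check mentioned in the tips above, one option is a permutation test on the difference in mean WPM between two arms that received identical treatment. This is a sketch, not the only reasonable analysis: the function name, the 5000-permutation default, and the fixed seed are assumptions for the example.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference of mean WPM between two arms."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_permutations + 1)
```

In a healthy A/A setup, small p-values should appear only about as often as chance predicts. A single low value is not proof of a bug, but repeated ones suggest a problem in assignment or logging worth fixing before the real test.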
Bottom line recommendations
- Default fonts: For prose passages, a well‑hinted proportional sans or serif is a safe default; for code, a modern monospaced code font is appropriate. If you support ligatures, make them opt‑in with a visible toggle.
- Default line length: Start near 60–70 characters per line on desktop, tighten on mobile, and let your A/B tests confirm where your users land for speed, accuracy, and comfort. External syntheses and experiments commonly place optimal ranges around 50–75 CPL—but your audience, language mix, and screen sizes matter. (journals.sagepub.com)
Why this matters for fairness
If font choices systematically raise or lower WPM on your site, leaderboards and personal bests reflect formatting, not skill. Treat typography like any other test variable, measure it, and publish your defaults and rationale. That transparency builds trust—and helps your users type their true best.
---
References (selected): Duchnicky & Kolers on line length; Shaikh & Chaparro on online article measures; classic proportional vs monospaced readability findings; eye‑movement differences for mono vs proportional; OpenType GSUB and Fira Code’s clarification that ligatures are glyph substitutions only; MDN on the `ch` unit for controlling measure. (journals.sagepub.com)