April 2, 2026 / 4 min read / Tools / Growing

Attaching final Playwright screenshots to PR comments

A useful PR review surface starts with one final screenshot per E2E test, a manifest the runner can consume, and selective artifact comments instead of raw report dumps.

Context

End-to-end UI tests often produce evidence that is technically available but operationally weak.

The report exists, screenshots exist, and traces exist, but the reviewer still has to leave the PR, find the artifact bundle, and reconstruct which image matters.

That is too much friction for a review loop that only needs quick answers to three questions:

- What did the final page state look like?
- Which spec produced it?
- Is the result relevant to the files that changed?

The problem is not artifact generation alone. It is turning Playwright output into a review surface that fits directly inside the PR flow.

Decision / Insight

Treat final UI screenshots as review artifacts with their own pipeline, not as incidental test byproducts.

The useful flow is:

Playwright final-state capture -> manifest -> runner-side selection -> optional public upload -> PR comment

That order matters. The test repo is only responsible for producing stable evidence. The PR runner is responsible for deciding whether that evidence belongs in the review conversation.

This keeps the boundary clean:

- The Playwright layer captures final screenshots consistently.
- The manifest exposes them in a machine-readable shape.
- The PR pipeline selects only artifacts relevant to the current change.
- The comment stays compact instead of dumping the full report surface.

Breakdown

Options considered

Link only to the Playwright HTML report
- Easy to produce.
- Still forces the reviewer to navigate away from the PR and hunt for the meaningful screenshot.

Attach every screenshot generated by the run
- Maximizes visibility.
- Bloats the PR comment and weakens the signal when only a small part of the E2E suite matters to the change.

Capture one final screenshot per test and comment only the relevant subset
- Requires a small artifact pipeline.
- Produces a much better PR review surface.

Trade-offs

- Final full-page screenshots add storage and upload steps, but make visual review faster.
- Selecting artifacts from changed files reduces noise, but means the pipeline needs a reliable mapping from spec to screenshot.
- Public uploads make GitHub image embedding simple, but require retention and permission handling outside the repo.
- Fallback-to-local behavior keeps PR creation resilient, but produces a weaker review surface when upload fails.

Constraints

The pipeline stays useful only if it remains narrow:

- Capture the last visible page state, not every intermediate transition.
- Write artifacts to a predictable repo-local path.
- Emit a manifest the outer runner can parse without inference.
- Select artifacts based on changed E2E specs or E2E infrastructure changes.
- Keep PR comments compact, with a limited number of inline images.
- Never block PR creation when artifact upload fails.

Implementation

In the site repo, Playwright captures one final full-page screenshot after each E2E test and records a structured artifact entry with:

- screenshot path
- test title
- test file
- route
- project name
- creation timestamp
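As a rough illustration, one such entry could be modeled as below. This is a minimal sketch, not the site repo's actual code: the field names, the `slugify` helper, and the `output/playwright/screenshots/` subdirectory are all assumptions. In a real hook the inputs would likely come from Playwright's `testInfo` (`title`, `file`, `project.name`) and `page.url()`, with the image itself written by `page.screenshot({ path, fullPage: true })`.

```typescript
// Hypothetical shape of one review artifact entry; field names are illustrative.
interface ReviewArtifact {
  screenshotPath: string; // repo-relative path to the final full-page screenshot
  testTitle: string;
  testFile: string; // repo-relative spec path
  route: string; // pathname of the page when the test finished
  projectName: string; // Playwright project name, e.g. "chromium"
  createdAt: string; // ISO 8601 timestamp
}

// Turn arbitrary text into a filesystem-safe slug so paths stay predictable.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build the entry for one finished test from plain values.
// The screenshots/ subdirectory is an assumed layout, not a confirmed one.
function buildArtifactEntry(input: {
  testTitle: string;
  testFile: string;
  pageUrl: string;
  projectName: string;
  now: Date;
}): ReviewArtifact {
  const name = `${slugify(input.projectName)}--${slugify(input.testTitle)}.png`;
  return {
    screenshotPath: `output/playwright/screenshots/${name}`,
    testTitle: input.testTitle,
    testFile: input.testFile,
    route: new URL(input.pageUrl).pathname, // drops origin, query, and hash
    projectName: input.projectName,
    createdAt: input.now.toISOString(),
  };
}
```

Deriving the file name from the project and test title keeps the path predictable, so the same test overwrites its own screenshot on the next run instead of accumulating files.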

Those entries are collected under output/playwright/ and compiled into output/playwright/review-artifacts.json during global teardown.

That gives the outer runner a deterministic input instead of asking it to inspect Playwright internals or parse report HTML.
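The compile step and the runner-side parse could look roughly like this. It is a sketch under stated assumptions: the `version` field, the field names, and the rule of keeping only the newest entry per test (so retries do not produce duplicates) are illustrative choices rather than the confirmed manifest format. File reading and writing are left out so the logic stays easy to test.

```typescript
// Assumed entry shape; field names are illustrative, not the real manifest schema.
interface ReviewArtifact {
  screenshotPath: string;
  testTitle: string;
  testFile: string;
  route: string;
  projectName: string;
  createdAt: string; // ISO 8601, so string order matches time order
}

interface ReviewManifest {
  version: 1; // hypothetical schema version
  artifacts: ReviewArtifact[];
}

const REQUIRED_FIELDS = [
  "screenshotPath", "testTitle", "testFile", "route", "projectName", "createdAt",
] as const;

// Plain code-unit comparison keeps ordering independent of the machine's locale.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Merge per-test entries into one manifest with a stable order.
function compileManifest(entries: ReviewArtifact[]): ReviewManifest {
  // Keep only the newest entry per (project, file, title).
  const latest = new Map<string, ReviewArtifact>();
  for (const entry of entries) {
    const key = [entry.projectName, entry.testFile, entry.testTitle].join("\u0000");
    const existing = latest.get(key);
    if (!existing || existing.createdAt < entry.createdAt) latest.set(key, entry);
  }
  const artifacts = Array.from(latest.values()).sort(
    (a, b) =>
      compareStrings(a.testFile, b.testFile) ||
      compareStrings(a.testTitle, b.testTitle) ||
      compareStrings(a.projectName, b.projectName),
  );
  return { version: 1, artifacts };
}

// Runner side: reject anything that does not match the expected shape
// instead of guessing at missing fields.
function parseManifest(json: string): ReviewManifest {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("manifest must be a JSON object");
  }
  const { version, artifacts } = data as { version?: unknown; artifacts?: unknown };
  if (version !== 1 || !Array.isArray(artifacts)) {
    throw new Error("unsupported manifest shape");
  }
  for (const item of artifacts) {
    for (const field of REQUIRED_FIELDS) {
      if (typeof (item as Record<string, unknown> | null)?.[field] !== "string") {
        throw new Error(`artifact is missing string field "${field}"`);
      }
    }
  }
  return { version: 1, artifacts: artifacts as ReviewArtifact[] };
}
```

A teardown step would serialize the result with `JSON.stringify(compileManifest(entries), null, 2)`, and the runner would call `parseManifest` on the file contents before doing anything else with them.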

In Night Shift, the github-issues profile reads that manifest after the code change is complete and validations pass. It then:

- loads all available review artifacts
- selects only the artifacts tied to changed E2E specs, or all artifacts when shared E2E infrastructure changed
- preserves the selected files inside the task directory
- optionally uploads them to a public droplet path
- comments on the new draft PR with one inline image plus links to the remaining uploaded artifacts
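The selection step in that list can be sketched as a pure function over the manifest entries and the list of changed files. The infrastructure paths in `E2E_INFRA_PATHS` are placeholders chosen for illustration; the real list depends on how the site repo lays out its Playwright config and shared fixtures.

```typescript
// Only the fields needed for selection; names are illustrative.
interface ReviewArtifact {
  testFile: string; // repo-relative spec path
  testTitle: string;
  screenshotPath: string;
}

// Assumed locations of shared E2E infrastructure. Entries ending in "/" match
// a directory prefix, everything else matches an exact file path.
const E2E_INFRA_PATHS = ["playwright.config.ts", "e2e/fixtures/", "e2e/global-teardown.ts"];

// Normalize separators and a leading "./" so paths from git and from
// the manifest compare equal.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "");
}

function touchesInfra(file: string): boolean {
  return E2E_INFRA_PATHS.some((infra) =>
    infra.endsWith("/") ? file.startsWith(infra) : file === infra,
  );
}

// Return all artifacts when shared infrastructure changed, otherwise only
// those whose spec file is in the change set. Unrelated changes select nothing.
function selectArtifacts(
  artifacts: ReviewArtifact[],
  changedFiles: string[],
): ReviewArtifact[] {
  const changed = changedFiles.map(normalizePath);
  if (changed.some(touchesInfra)) return artifacts.slice();
  const changedSet = new Set(changed);
  return artifacts.filter((a) => changedSet.has(normalizePath(a.testFile)));
}
```

Returning an empty list for changes that touch neither specs nor infrastructure lets the caller skip the comment entirely rather than post screenshots that have nothing to do with the change.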

If upload is not configured or fails, the PR still gets created. The comment falls back to local preserved paths instead of embedded public images.

That fallback is important. The screenshot pipeline improves review quality, but it is not allowed to become a write-path dependency for opening the PR itself.
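One way to keep upload off the critical path is to catch upload failures and let the comment builder degrade on its own. The sketch below assumes a hypothetical `Uploader` callback that returns a public URL, an all-or-nothing upload policy, and an invented comment layout; none of these are taken from Night Shift itself.

```typescript
interface SelectedArtifact {
  testTitle: string;
  localPath: string; // preserved copy inside the task directory
  publicUrl?: string; // set only after a successful upload
}

// Hypothetical uploader: resolves to a public URL or rejects on failure.
type Uploader = (localPath: string) => Promise<string>;

// Try to upload everything. On a missing uploader or any failure, return the
// artifacts unchanged so the caller can still open the PR and comment.
async function uploadOrFallback(
  artifacts: SelectedArtifact[],
  upload?: Uploader,
): Promise<SelectedArtifact[]> {
  if (!upload) return artifacts;
  try {
    const urls = await Promise.all(artifacts.map((a) => upload(a.localPath)));
    return artifacts.map((a, i) => ({ ...a, publicUrl: urls[i] }));
  } catch {
    return artifacts; // upload is best effort, never fatal
  }
}

// Build a compact Markdown comment: one inline image plus links when public
// URLs exist, otherwise a list of locally preserved paths.
function buildComment(artifacts: SelectedArtifact[]): string {
  if (artifacts.length === 0) {
    return "No E2E review artifacts were selected for this change.";
  }
  const lines = ["### E2E review artifacts", ""];
  const uploaded = artifacts.every((a) => a.publicUrl !== undefined);
  if (!uploaded) {
    lines.push("Upload unavailable; screenshots preserved locally:");
    for (const a of artifacts) lines.push(`- ${a.testTitle}: \`${a.localPath}\``);
    return lines.join("\n");
  }
  const [first, ...rest] = artifacts;
  lines.push(`![${first.testTitle}](${first.publicUrl})`);
  for (const a of rest) lines.push(`- [${a.testTitle}](${a.publicUrl})`);
  return lines.join("\n");
}
```

Because `uploadOrFallback` never rejects, the PR-creation step does not need its own error handling for the screenshot pipeline; the worst case is a comment that lists local paths.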

This also keeps the same architectural boundary used in Night Shift for GitHub issue backlogs: the runner owns the review surface, while the execution layer only produces bounded artifacts.

Reusable Takeaway

If Playwright screenshots are meant to help code review, treat them as first-class review artifacts.

A practical baseline is:

- capture one final screenshot per E2E test
- emit a manifest the outer pipeline can consume
- select artifacts from changed files instead of posting everything
- upload for inline PR embedding when available
- fall back gracefully when uploads fail

The non-obvious improvement is not better screenshots. It is better placement.

Once the final UI state is visible directly in the PR comment, the evidence stops behaving like test exhaust and starts behaving like review input.