DOCX Cleanup for Clean Exports

DOCX imports can carry years of hidden formatting debt. This cleanup playbook prioritizes structure, paragraph integrity, and heading consistency so final PDF and EPUB exports stay predictable.

Run structure cleanup before visual cleanup

Start by fixing document structure: chapter headings, scene separators, and front/back matter boundaries.

If structure is unstable, every later formatting pass produces noisy warnings and false regressions.

Do not tune typography first. Lock structure first, then clean presentation-level issues.

Normalize heading hierarchy and remove duplicates

Keep one heading pattern per level (for example: chapter = Heading 1, scene = body text plus divider).

Delete duplicate chapter lines created by copy/paste imports. Duplicate headings produce repeated TOC entries and links that land on the wrong chapter anchor.

If a chapter title appears twice, once as plain text and once with a heading style, keep only the heading-styled line as the authoritative one.
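One way to spot duplicate chapter lines is to compare each paragraph's normalized text against the headings already seen. The sketch below assumes the paragraphs have been extracted as (style name, text) pairs; the function name and the "Heading 1" label are illustrative assumptions, not part of any specific tool.

```python
from typing import Dict, List, Tuple


def find_duplicate_headings(paragraphs: List[Tuple[str, str]],
                            heading_style: str = "Heading 1") -> List[int]:
    """Return indexes of paragraphs that repeat a chapter heading's text.

    Hypothetical helper: `paragraphs` is a list of (style name, text) pairs.
    """
    def norm(text: str) -> str:
        # Collapse whitespace and ignore case so near-identical lines match.
        return " ".join(text.split()).casefold()

    # First pass: the first heading-styled line for each title is authoritative.
    authoritative: Dict[str, int] = {}
    for i, (style, text) in enumerate(paragraphs):
        key = norm(text)
        if style == heading_style and key and key not in authoritative:
            authoritative[key] = i

    # Second pass: any other line with the same text is a duplicate candidate.
    return [i for i, (_, text) in enumerate(paragraphs)
            if norm(text) in authoritative and authoritative[norm(text)] != i]


sample = [
    ("Normal", "Chapter 1"),      # plain-text copy of the title
    ("Heading 1", "Chapter 1"),   # authoritative heading
    ("Normal", "It was late."),
    ("Heading 1", "Chapter 2"),
    ("Heading 1", "chapter  2"),  # pasted duplicate with stray spacing
]
print(find_duplicate_headings(sample))  # [0, 4]
```

Flagged lines are candidates for review rather than automatic deletions, since a body paragraph can legitimately repeat a chapter title.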

Remove manual spacing and tab hacks

Replace repeated blank lines, tabs-for-indent, and manual line breaks with normal paragraph styling.

Legacy DOCX files often use visual hacks that collapse differently across print and EPUB outputs.

Any spacing achieved with repeated returns is fragile; convert it to style-based paragraph spacing.
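These hacks can also be located mechanically. A DOCX file is a ZIP archive whose main text lives in word/document.xml, so a short standard-library script can list the paragraphs that need attention. This is a minimal sketch under stated assumptions: the function name and report keys are hypothetical, and only direct runs of each paragraph are inspected (runs nested in hyperlinks are skipped).

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tag(name: str) -> str:
    return f"{{{W}}}{name}"


def find_spacing_hacks(document_xml) -> dict:
    """Report paragraph indexes with empty returns, tab indents, or soft breaks."""
    body = ET.fromstring(document_xml).find(tag("body"))
    report = {"empty_paragraphs": [], "tab_indents": [], "soft_breaks": []}
    for index, para in enumerate(body.iter(tag("p"))):
        runs = para.findall(tag("r"))
        text = "".join(t.text or "" for r in runs for t in r.findall(tag("t")))
        breaks = [b for r in runs for b in r.findall(tag("br"))]
        # A w:br with no w:type (or type "textWrapping") is a manual line break.
        if any(b.get(tag("type"), "textWrapping") == "textWrapping" for b in breaks):
            report["soft_breaks"].append(index)
        has_drawing = para.find(f".//{tag('drawing')}") is not None
        if not text.strip() and not breaks and not has_drawing:
            report["empty_paragraphs"].append(index)
        # Tab used as an indent: the first content element of the first run is w:tab.
        if runs and text.strip():
            content = [child for child in runs[0] if child.tag != tag("rPr")]
            if content and content[0].tag == tag("tab"):
                report["tab_indents"].append(index)
    return report


sample_xml = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:t>Chapter One</w:t></w:r></w:p>'
    '<w:p/>'
    '<w:p/>'
    '<w:p><w:r><w:tab/><w:t>Indented with a tab.</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>Line one</w:t><w:br/><w:t>line two</w:t></w:r></w:p>'
    '</w:body></w:document>'
)
print(find_spacing_hacks(sample_xml))
# {'empty_paragraphs': [1, 2], 'tab_indents': [3], 'soft_breaks': [4]}
```

For a real manuscript, the XML can be read with `zipfile.ZipFile(path).read("word/document.xml")` and passed in as bytes. The script only reports locations; the fix itself still belongs in the paragraph styles of the source document.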

Clear hidden artifacts that break export flow

Turn on hidden characters in your source editor and remove stray section breaks, page breaks, and mixed list markers.

Watch for pasted web content carrying inline style noise and nonstandard characters.

If one chapter behaves differently, inspect it for direct formatting overrides before changing global settings.
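When one chapter misbehaves, a scan of its XML can point to the hidden marks before anyone touches global settings. The following is a rough sketch, assuming the content of word/document.xml has already been read from the DOCX archive. The function name and the finding labels are hypothetical, and only three artifact types are checked: page breaks, section breaks, and run-level font overrides.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(name: str) -> str:
    return f"{{{W}}}{name}"


def find_hidden_artifacts(document_xml) -> list:
    """Return (paragraph index, label) pairs for breaks and font overrides."""
    root = ET.fromstring(document_xml)
    findings = []
    for index, para in enumerate(root.iter(q("p"))):
        # A section break is stored as w:sectPr inside the paragraph properties.
        ppr = para.find(q("pPr"))
        if ppr is not None and ppr.find(q("sectPr")) is not None:
            findings.append((index, "section break"))
        for run in para.iter(q("r")):
            # A hard page break is a w:br element with w:type="page".
            for br in run.findall(q("br")):
                if br.get(q("type")) == "page":
                    findings.append((index, "page break"))
            # w:rFonts in run properties is a direct font setting on that run.
            rpr = run.find(q("rPr"))
            if rpr is not None and rpr.find(q("rFonts")) is not None:
                findings.append((index, "direct font override"))
    return findings


sample_xml = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:t>Plain body text.</w:t></w:r></w:p>'
    '<w:p><w:r><w:br w:type="page"/></w:r></w:p>'
    '<w:p><w:r><w:rPr><w:rFonts w:ascii="Arial"/></w:rPr>'
    '<w:t>Pasted text</w:t></w:r></w:p>'
    '<w:p><w:pPr><w:sectPr/></w:pPr></w:p>'
    '</w:body></w:document>'
)
print(find_hidden_artifacts(sample_xml))
# [(1, 'page break'), (2, 'direct font override'), (3, 'section break')]
```

Not every finding is a defect: an intentional chapter-start page break will be listed too, so the output is a review list rather than a delete list.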

Re-export once, then triage with a failure-mode checklist

Generate one fresh release-candidate export after cleanup and inspect chapter starts, TOC links, and paragraph rhythm.

Classify every defect as a structure, spacing, heading, or artifact issue, and fix it in the source DOCX only.

Avoid patching final export files. Source fixes keep future revisions stable.
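The four-way classification is easier to keep consistent when it is written down in a fixed shape. The sketch below shows one possible triage log; the class names, fields, and sample entries are all hypothetical and only illustrate the grouping step.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class DefectKind(Enum):
    STRUCTURE = "structure"
    SPACING = "spacing"
    HEADING = "heading"
    ARTIFACT = "artifact"


@dataclass(frozen=True)
class Defect:
    kind: DefectKind
    location: str  # where the defect shows up in the export
    note: str      # what to change in the source DOCX


def group_by_kind(defects: List[Defect]) -> Dict[DefectKind, List[Defect]]:
    """Group defects by category, in the order the categories are declared."""
    grouped = defaultdict(list)
    for defect in defects:
        grouped[defect.kind].append(defect)
    return {kind: grouped[kind] for kind in DefectKind if kind in grouped}


log = [
    Defect(DefectKind.SPACING, "Chapter 3", "repeated returns before scene break"),
    Defect(DefectKind.HEADING, "Chapter 5", "title duplicated as plain text"),
    Defect(DefectKind.SPACING, "Chapter 7", "tab used as first-line indent"),
]
for kind, items in group_by_kind(log).items():
    print(kind.value, len(items))
# spacing 2
# heading 1
```

Working through one category at a time keeps each source fix small and makes it easier to confirm in the next export.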

Numbers and Reference Tables

Top DOCX import failure modes and first fixes

| Failure mode | How to detect quickly | First source-first fix |
| --- | --- | --- |
| Duplicate chapter headings | TOC points to repeated titles or wrong chapter anchors. | Keep one Heading 1 per chapter and remove duplicates. |
| Manual blank-line spacing | Large vertical jumps vary between chapters. | Replace repeated returns with paragraph spacing styles. |
| Tab-based paragraph indents | First lines indent inconsistently in export. | Remove tabs and set first-line indent in paragraph style. |
| Hidden page/section breaks | Unexpected blank pages before chapter starts. | Reveal hidden marks and remove unintended breaks. |
| Mixed list marker styles | Bullets/numbering shift style mid-section. | Normalize list style in source and reapply once. |
| Inline font overrides from pasted text | Random font changes inside body paragraphs. | Clear direct formatting and reapply body style. |
| Soft line breaks inside paragraphs | Paragraph wraps strangely in EPUB output. | Replace manual line breaks with normal paragraph flow. |
| Scene divider copied as image | Divider alignment/size drifts across formats. | Use a text divider or one consistent ornament style. |
| Inconsistent quote punctuation characters | Search/replace misses quote variants. | Normalize smart quotes and apostrophes globally. |
| Broken heading levels in front matter | TOC includes title/copyright pages unexpectedly. | Use body style for non-TOC front matter lines. |
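For the quote-punctuation row in the table above, a quick count of which variants appear helps before running any global replace. This is a small sketch; the function names and labels are hypothetical, and it only reports what is present without choosing a target style.

```python
from collections import Counter

# Straight and curly quote characters that commonly get mixed in pasted text.
QUOTE_VARIANTS = {
    '"': "straight double",
    "'": "straight single",
    "\u201c": "curly double open",
    "\u201d": "curly double close",
    "\u2018": "curly single open",
    "\u2019": "curly single close / apostrophe",
}

STRAIGHT = {'"', "'"}


def count_quote_variants(text: str) -> Counter:
    """Count each quote or apostrophe variant found in the text."""
    return Counter(QUOTE_VARIANTS[ch] for ch in text if ch in QUOTE_VARIANTS)


def has_mixed_quotes(text: str) -> bool:
    """True when both straight and curly quote characters occur."""
    found = {ch for ch in text if ch in QUOTE_VARIANTS}
    return bool(found & STRAIGHT) and bool(found - STRAIGHT)


sample = "\u201cIt\u2019s fine,\u201d she said. \"Really?\""
print(count_quote_variants(sample))
print(has_mixed_quotes(sample))  # True
```

If the count shows more than one family of characters, pick one convention for the manuscript and run the replacement in the source DOCX.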

Publish Checklist

  1. Lock one heading hierarchy before any visual styling changes.
  2. Remove duplicate chapter headings and duplicate anchor text.
  3. Replace tab indents and repeated blank lines with paragraph styles.
  4. Show hidden characters and remove unintended page/section breaks.
  5. Clear direct formatting from pasted web or email content.
  6. Normalize list styles and divider conventions.
  7. Run find/replace for inconsistent quote and dash characters.
  8. Export one release-candidate file after cleanup changes.
  9. Verify TOC targets, chapter starts, and paragraph spacing in export.
  10. Record resolved issues in the release notes before the final formatting pass.

Warning-to-Fix Map

Warning pattern: TOC links land on wrong chapter

Fix: Remove duplicate heading lines and normalize chapter heading levels.

Verify: Every TOC link lands on the expected chapter heading in order.

Warning pattern: unexpected blank pages near chapter starts

Fix: Remove hidden section/page breaks and apply one chapter-start rule to every chapter.

Verify: Only intentional blank pages (those inserted so a chapter starts on the correct odd or even page) remain after re-export.

Warning pattern: paragraph spacing varies chapter to chapter

Fix: Replace manual blank lines with style-based paragraph spacing.

Verify: Paragraph spacing is consistent when spot-checking several chapters.

Warning pattern: random font changes in body text

Fix: Clear direct formatting and reapply one body-text style.

Verify: Font properties remain consistent across sample chapters.

Warning pattern: list formatting breaks after import

Fix: Normalize list style definitions and reapply list blocks.

Verify: Bulleted/numbered lists render consistently in export.

Proof Checks

Cleanup proof packet

  • Capture one screenshot of normalized heading structure in your editor outline.
  • Capture one screenshot showing hidden-mark cleanup for removed breaks.
  • Capture one screenshot of TOC verification in exported file.

Release handoff checks

  • Use a versioned export name such as manuscript-cleanup-rc02.
  • Store cleanup notes and screenshots beside the release-candidate export.
  • Only start final print/EPUB QA after cleanup issues are closed.
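Versioned names are easier to keep consistent when the next name is derived from the previous one. A minimal sketch, assuming names end in a -rcNN suffix like the example above; the function name is hypothetical.

```python
import re


def next_release_candidate(name: str) -> str:
    """Increment the trailing -rcNN counter, keeping its zero padding."""
    match = re.fullmatch(r"(.*-rc)(\d+)", name)
    if match is None:
        raise ValueError(f"no -rcNN suffix in {name!r}")
    prefix, digits = match.groups()
    # Pad to the original width; the number grows naturally past that width.
    return f"{prefix}{int(digits) + 1:0{len(digits)}d}"


print(next_release_candidate("manuscript-cleanup-rc02"))  # manuscript-cleanup-rc03
```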

The Senswriter way (faster)

Use the same workflow in one workspace: draft, export, run checks, fix source, and publish one clean release-candidate file.

Frequently Asked Questions

Do I need a perfect DOCX before importing?

No. You need stable structure and clean paragraph behavior. The cleanup pass removes most hidden issues before final export QA.

What is the fastest cleanup priority order?

Fix heading hierarchy first, remove spacing hacks second, then clear hidden breaks and direct formatting.

Can I skip cleanup if the first export looks mostly fine?

You can, but hidden DOCX artifacts often reappear during late edits. Cleaning up early reduces regression risk in the final release week.
