Cleanup
Cleanup removes unnecessary white space and Word formatting from your document for conversion to XML.
You can access and customise document-specific settings within the Cleanup dialog by navigating to Orion ribbon → Document → Cleanup.
Default settings (JATS)
The following functions are available, and the options below are selected by default for a JATS configuration.
| Field | Description |
|---|---|
| Breaks and White Space | Removes extra spacing in your document. |
| Typographic | Removes optional hyphens and manages soft returns. |
| Replace Character Styles with Local Formatting | Replaces applied character styles (including built-in Word styles and user-defined styles) with local formatting. |
| Auto-generated Text to Plain Text | Converts automated Word content to plain text. |
| Comments, Bookmarks, and Hidden Text | Manages the removal or handling of nonprinting Microsoft Word features. |
| Graphics | Manages how images, figures, and other graphics are handled in your document. |
| Tables | Automates formatting of all tables. |
How to run Cleanup
To remove unnecessary white space and Word formatting:
Open the Cleanup dialog: Select Cleanup from the Document group of the Orion tab (Orion ribbon → Document → Cleanup).
INFO
The default settings, configured in Typefi Orion Compass, reload every time you open the Cleanup dialog. Any local changes you make apply only to the current cleanup process.
Select the options you want to include. Use the following buttons for quick adjustments:
- Set all: Selects all Cleanup options.
- Clear all: Deselects all settings (removes checkmarks), allowing you to select and run only the cleanup options you need.
- Reset: Reloads all default settings configured in Typefi Orion Compass.
Click OK to run the selected cleanup options.
TIP
Consistently making changes to the Cleanup dialog for every document? Your Typefi Orion administrator can adjust your settings in Typefi Orion Compass.
Breaks and white space
Breaks and White Space removes extra spacing in your document.
How white space cleanup functions interact
For comprehensive cleanup, Convert Tabs to Spaces runs before Remove Multiple Spaces and Remove Start and End Paragraph Spaces to ensure all extra spacing is standardised and removed.
When selected, the following Cleanup functions will be applied to the entire document:
| Field | Description |
|---|---|
| Breaks and White Space | Use this section to remove extra spacing in your document. |
| Convert Tabs to Spaces | Change each tab character to one space. NOTE: If your document includes tabs for specific formatting (for example, abbreviated lists), you may not want to convert them. Deselect Convert Tabs to Spaces to keep tabs, but be aware that Remove Multiple Spaces will not be able to clean up the resulting white space. Convert Tabs to Spaces is selected by default. |
| Remove Multiple Spaces | Replace all instances of two or more spaces (leading, trailing, or in the middle of the text) with a single space. NOTE: If this option is deselected while Convert Tabs to Spaces is enabled, all tabs will be converted to multiple spaces that will remain in your document. Remove Multiple Spaces is selected by default. |
| Remove Blank Paragraphs | Delete empty paragraphs. Remove Blank Paragraphs is selected by default. |
| Remove Start and End Paragraph Spaces | Remove leading spaces at the beginning of paragraphs, trailing spaces at the end of paragraphs, and all nonbreaking spaces and soft returns. Remove Start and End Paragraph Spaces is selected by default. |
| Remove Column Breaks | Remove column breaks used to enforce layout design in Word. Remove Column Breaks is selected by default. |
| Remove Page Breaks | Remove all manual page breaks. Remove Page Breaks is selected by default. |
| Remove Section Breaks | Remove breaks used for general layout (for example, page numbering) that aren't relevant to XML. Remove Section Breaks is deselected by default. NOTE: If your document includes breaks for structures that require their own page settings (for example, landscape tables), do not select Remove Section Breaks. |
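The interaction between the tab and space options can be illustrated with a short Python sketch. It is not Typefi Orion's implementation: the function `clean_paragraph` and its flags are hypothetical, and it models only three of the options above on a single paragraph string (spaces only, not nonbreaking spaces or soft returns).

```python
import re


def clean_paragraph(text, convert_tabs=True, remove_multiple_spaces=True,
                    trim_ends=True):
    """Hypothetical model of three white space options, applied in order."""
    if convert_tabs:
        # Each tab becomes one space, so a run of tabs becomes a run of spaces.
        text = text.replace("\t", " ")
    if remove_multiple_spaces:
        # Collapse any run of two or more spaces into a single space.
        text = re.sub(r" {2,}", " ", text)
    if trim_ends:
        # Drop leading and trailing spaces from the paragraph.
        text = text.strip(" ")
    return text


# All three options on: tabs and extra spaces are fully normalised.
print(repr(clean_paragraph("\tTotal\t\t42  ")))
# Tabs kept: the space cleanup has nothing to collapse, so the tabs remain.
print(repr(clean_paragraph("a\t\tb", convert_tabs=False)))
# Tabs converted but multiple spaces kept: two spaces are left behind.
print(repr(clean_paragraph("a\t\tb", remove_multiple_spaces=False)))
```

Running the tab conversion first is what lets the later steps see every run of spaces, which matches the ordering described at the start of this section.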
Typographic
Use the Typographic section of the Cleanup dialog to remove optional hyphens and manage soft returns to prepare for the conversion and export process.
| Field | Description |
|---|---|
| Typographic | Use this section to manage soft returns and remove optional hyphens. |
| Remove Optional Hyphens | Remove discretionary hyphens (invisible typesetting marks that suggest where a word can break at the end of a line). Remove Optional Hyphens is selected by default. |
| Soft Returns | Select an option from the drop-down menu to remove forced line breaks that don't begin new paragraphs (often created using Shift + Enter). |
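As a rough illustration of what removing optional hyphens means at the character level, the following Python sketch strips the Unicode soft hyphen (U+00AD) from a string. Treating U+00AD as a stand-in for Word's optional hyphen is an assumption made for this example, and the function name is hypothetical. It does not describe how Cleanup works internally.

```python
# Unicode SOFT HYPHEN, used here as an assumed stand-in for an optional hyphen.
SOFT_HYPHEN = "\u00ad"


def remove_optional_hyphens(text):
    """Hypothetical helper: delete every soft hyphen, leaving visible text intact."""
    return text.replace(SOFT_HYPHEN, "")


sample = "type" + SOFT_HYPHEN + "setting"
cleaned = remove_optional_hyphens(sample)

# The invisible mark is gone; real hyphens (U+002D) are left untouched.
print(len(sample), len(cleaned))
print(cleaned)
print(remove_optional_hyphens("well-known"))
```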
Replace character styles with local formatting
Character styles
Character styles are collections of rules that define how your text looks and behaves, such as the font and colour. In contrast to paragraph styles, they are not used to format entire paragraphs, but to make certain characters, words, phrases, or sentences stand out. For example, Microsoft Word's built-in character styles Strong and Emphasis may interfere with Typefi Orion functionality that looks for bold or italic text formatting.
The Replace Character Styles with Local Formatting section of Cleanup replaces applied character styles (including built-in Word styles and user-defined styles) with local formatting, that is, formatting applied directly to the text without using a named style. This removes named styles while preserving the text's visual appearance.
| Field | Description |
|---|---|
| Replace Character Styles with Local Formatting | Use this section to convert Word's built-in character styles and user-defined styles to local formatting. |
| Built-in Styles | Removes Word's built-in character styles (such as Strong, Emphasis, and Hyperlink) while retaining the local formatting (such as bold, italic, and underline). Built-in Styles is selected by default. |
| User Defined Styles | Converts custom character styles (any named character style not native to Word, for example, styles from a template or created by the author) to local formatting. User Defined Styles is selected by default. |
Auto-generated text to plain text
The Auto-generated Text to Plain Text section of the Cleanup dialog converts automatically generated Word content (for example, numbers and lists) to plain text.
| Field | Description |
|---|---|
| Auto-generated Text to Plain Text | Use this section to convert automatically generated Word content into plain text. |
| Automatic Numbering | Choose how to handle automatic numbering. Select an option from the drop-down menu to convert automatically numbered lists and headings to plain text. |
Comments, bookmarks, and hidden text
The Comments, Bookmarks, and Hidden Text section of Cleanup manages the removal or handling of these nonprinting Microsoft Word features.
| Field | Description |
|---|---|
| Comments, Bookmarks, and Hidden Text | Use this section to choose how comments, bookmarks, hidden text, and tracked changes are handled. |
| Remove Word Comments | Deletes all comments and associated replies in the document. Removing a specific reviewer's comments: If multiple reviewers left comments, an option appears in the drop-down menu to select comments from a specific reviewer. Use this to remove only that reviewer's comments and all associated threads. Remove Word Comments is deselected by default. |
| Remove Bookmarks | Deletes all document bookmarks. Bookmarks, which are used for navigation, can interfere with advanced processes if left in the file. Remove Bookmarks is selected by default. |
| Hidden Text | Detects hidden text in your document. |
| Track Changes | Choose how tracked changes are handled. |
Always display hidden text
To configure Word to always display hidden text, follow these steps:
- Navigate to the Display section of the Word Options dialog (Word ribbon → File → Options → Display).
- Under "Always show these formatting marks on the screen," select Hidden text.
- Click OK. All hidden text is now marked with a dotted underline.
Alternatively, you can display hidden text (along with other nonprinting characters) by clicking the pilcrow (¶) icon in the Word ribbon.
Graphics
Use the Graphics section of the Cleanup dialog to manage how the embedded images, figures, and other graphics are handled in your document.
| Field | Description |
|---|---|
| Graphics | Use this section to manage the embedded images, figures, and other graphics in your document. Options include Remove Graphics and Export and Remove Graphics. |
Graphics placeholder text
Selecting Remove Graphics or Export and Remove Graphics enables a text field. The text field contains placeholder text that replaces the removed graphics in the Word document.
The default placeholder text is [INSERT FIGURE %Z], where %Z becomes a sequential number (for example, 001, 002, and 003) as graphics are removed.
Additional placeholder text option:
[INSERT FIGURE %N], where %N becomes the name of the Word file without the .doc extension (for example, MyDocument).
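The way the %Z and %N tokens expand can be sketched in Python. This is an illustration only: the function `expand_placeholder` is hypothetical, the three-digit zero padding is inferred from the 001, 002, 003 example, and the sketch says nothing about how Typefi Orion performs the substitution internally.

```python
from pathlib import Path


def expand_placeholder(template, index, file_name):
    """Hypothetical expansion of the placeholder tokens for one removed graphic.

    %Z -> position of the graphic in the document, zero-padded to three digits
    %N -> name of the Word file without its extension
    """
    sequence = f"{index:03d}"
    doc_name = Path(file_name).stem
    return template.replace("%Z", sequence).replace("%N", doc_name)


# Default placeholder text for the first three removed graphics.
for i in range(1, 4):
    print(expand_placeholder("[INSERT FIGURE %Z]", i, "MyDocument.doc"))

# Alternative placeholder text using the file name.
print(expand_placeholder("[INSERT FIGURE %N]", 1, "MyDocument.doc"))
```

With the default text, the three printed placeholders end in 001, 002, and 003. With %N, the placeholder reads [INSERT FIGURE MyDocument].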
Tables
The Tables section of the Cleanup dialog applies automated formatting to all tables in a document. Note: The cleanup process can't apply settings to individual tables.
| Field | Description |
|---|---|
| Tables | Use this section to automate table formatting. All fields are deselected by default. |
| Remove Borders | Removes borders around tables and all cells. |
| Center Columns | Centers table column text. |
| Remove Shading | Removes all cell shading. |
| Add Top/Bottom Borders | Adds a 0.25 pt rule to the top and bottom of a table. |
| Left Justify First Column | Applies left-justified formatting to the first column only. |
| AutoFit Contents | Removes width settings in tables and applies AutoFit to all table cells, which adjusts table cell width to fit the size of your content. |
| Add Header Border | Adds a border to the heading row of a table. NOTE: To set up table headers in Word: Select your table → select Table Layout in the Word ribbon → select Repeat Header Rows. |
| Remove First Line Indent | Removes indents from the first line of text in all cells. |