ElyxAI

How to Remove HTML Tags from Text in Excel

Shortcut: Ctrl+H
Excel 2016, Excel 2019, Excel 365, Excel Online

Learn how to strip HTML tags from text in Excel using formulas and Find & Replace. This skill is essential when importing web content, cleaning datasets, or working with HTML-formatted data. You'll master multiple methods to extract clean text efficiently.

Why This Matters

Removing HTML tags is critical when cleaning web-scraped data or importing formatted content from web services. Leftover markup inflates text length, breaks matching and lookups, and clutters reports.

Prerequisites

  • Basic understanding of Excel formulas and cell references
  • Familiarity with Find & Replace function (Ctrl+H)
  • Sample data containing HTML tags

Step-by-Step Instructions

1

Prepare your data

Open Excel and paste or import your HTML-tagged text into a column. Ensure the data is in a single column with one entry per row.

2

Open Find & Replace dialog

Press Ctrl+H to open the Find & Replace dialog. This is the fastest method for removing all HTML tags at once.

3

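Check the search options

Click Options and make sure 'Match entire cell contents' is unchecked. Excel's Find & Replace dialog has no regular-expression option; it supports the wildcards * (any run of characters) and ? (any single character) instead.

4

Enter the wildcard pattern

In the Find what field, enter: <*> (this matches a <, the characters after it, and the closing >). Leave the Replace with field empty. The regex equivalent, <[^>]*>, works only in functions that accept regular expressions, not in this dialog.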
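The matching rule used in this step (a <, anything up to the next >, then the >) can be tried outside Excel using its regex form, <[^>]*>. What follows is a minimal Python sketch, not part of the Excel workflow; the helper name strip_tags and the sample string are illustrative assumptions.

```python
import re

# Regex form of the tag pattern: "<", then any run of characters
# that are not ">", then the closing ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Return text with every <...> span removed (hypothetical helper)."""
    return TAG_PATTERN.sub("", text)


sample = "<p>Hello <b>world</b></p>"
print(strip_tags(sample))  # Hello world
```

Because the pattern stops at the first closing bracket, each tag is removed on its own and the text between tags is kept.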

5

Execute replacement

Click 'Replace All' to remove all HTML tags in one action. Review the results and undo (Ctrl+Z) if needed.

Alternative Methods

SUBSTITUTE formula method

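Use nested SUBSTITUTE functions to remove specific HTML tags one at a time, e.g. =SUBSTITUTE(SUBSTITUTE(A1,"<p>",""),"</p>",""). Excel string literals require double quotes, not single quotes. SUBSTITUTE is case-sensitive and needs one call per tag variant, so this becomes tedious when the data contains many different tags.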
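To see why the one-replacement-per-tag approach scales poorly, here is a small Python sketch that mimics a chain of nested SUBSTITUTE calls. The tag list and the name substitute_chain are assumptions made for illustration; the literal, case-sensitive replacement is the point being shown.

```python
from functools import reduce

# Hypothetical list of tags, one entry per nested SUBSTITUTE call.
TAGS_TO_REMOVE = ("<p>", "</p>", "<b>", "</b>")


def substitute_chain(text: str, tags=TAGS_TO_REMOVE) -> str:
    """Apply one literal, case-sensitive replacement per listed tag."""
    return reduce(lambda acc, tag: acc.replace(tag, ""), tags, text)


print(substitute_chain("<p>Total: <b>42</b></p>"))  # Total: 42
print(substitute_chain("<P>Total</P><br/>"))        # <P>Total</P><br/>
```

The second call is returned unchanged: uppercase tags and any tag missing from the list survive, which is the main weakness of this method.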

Power Query (Get & Transform)

Load the data into Power Query (on the Data tab, in the Get & Transform group), apply Replace Values for each tag type, then load the result back to Excel. Because the query can be refreshed when the source changes, this suits repetitive cleaning workflows.

Google Sheets REGEXREPLACE

If your Excel version has no regex functions, copy the data to Google Sheets and use =REGEXREPLACE(A1,"<[^>]*>","") (string arguments need double quotes), then paste the results back into Excel as values.

Tips & Tricks

  • Always test Find & Replace on a copy of your data first to avoid accidental loss of important information.
  • Use Ctrl+Z immediately after Replace All if the result is unexpected; undo works best right after the action.
  • For very large datasets, select only the columns that contain HTML before using Replace All to control its scope.

Pro Tips

  • After removing tags, run Find & Replace again to clean up leftover spaces: find two spaces and replace with one, repeating Replace All until no matches remain, or use =TRIM(A1).
  • Keep your patterns (<*> for Find & Replace, <[^>]*> for regex functions) in a text file for reuse across projects.
  • Replace All covers the whole worksheet when a single cell is selected and only the selected range when several cells are selected, so check your selection before running it.

Troubleshooting

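No 'Use regular expressions' checkbox in Find & Replace

Excel's Find & Replace dialog does not offer regular expressions in any version, on Windows or Mac; that checkbox exists in other spreadsheet applications such as LibreOffice Calc. Use the wildcard pattern <*> in the Find what field instead. For true regex support, use the REGEXREPLACE function available in current Microsoft 365 builds, e.g. =REGEXREPLACE(A1,"<[^>]*>",""), or use Power Query as an alternative.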

Some HTML tags remain after Replace All

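The tag pattern matches any text between < and >, so variations such as <br/>, <BR>, and <br > are all removed without separate passes. If tags remain, check that the selection covered every affected cell, that 'Match entire cell contents' is unchecked, and that the leftovers are not entity-encoded (for example &lt;br&gt;), which the pattern does not match. Switching the regex quantifier from * to + does not widen the match; <[^>]+> only skips the empty pair <>.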

Extra spaces or line breaks appear after cleaning

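Use Find & Replace again: enter two spaces in the Find what field and one space in the Replace with field, then repeat Replace All until no matches are found. Alternatively, =TRIM(A1) collapses repeated spaces and removes leading and trailing ones. To remove line breaks, press Ctrl+J in the Find what field and replace with a space.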
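The cleanup described here, turning line breaks into spaces and then collapsing runs of spaces, can be sketched in Python as follows. This is an illustration of the logic only, assuming a hypothetical helper named tidy_whitespace; it is not something Excel runs.

```python
import re


def tidy_whitespace(text: str) -> str:
    """Replace line breaks with spaces, collapse space runs, trim the ends."""
    # Normalize Windows line endings, then turn each line break into a space.
    text = text.replace("\r\n", "\n").replace("\n", " ")
    # Collapse any run of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    return text.strip()


print(tidy_whitespace("Hello  \n world   again "))  # Hello world again
```

Line breaks are handled first so that the spaces they introduce are collapsed in the same pass.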

Frequently Asked Questions

Can I remove HTML tags from multiple columns at once?
Yes. Select all columns containing HTML-tagged data before opening Find & Replace (Ctrl+H); Replace All then applies only to that selection, which makes batch cleaning efficient.
What if my HTML tags contain attributes like <div class='text'>?
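Tags with attributes are handled the same way as plain tags. The pattern <[^>]*> matches everything from < up to the next >, because [^>]* accepts any character except the closing bracket, so <div class='text'> is removed as a whole. The one edge case is a literal > inside an attribute value, which ends the match early and leaves a fragment behind.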
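A short Python check of this behavior, again as an illustrative sketch with assumed sample strings and a hypothetical remove_tags helper, shows both the normal case and one edge case worth knowing: a > character inside an attribute value.

```python
import re


def remove_tags(text: str) -> str:
    """Strip every <...> span using the pattern discussed above."""
    return re.sub(r"<[^>]*>", "", text)


# A tag with attributes is consumed in one match.
print(remove_tags("<div class='text'>Body</div>"))  # Body

# A literal ">" inside an attribute ends the match early,
# so part of the tag is left behind.
print(repr(remove_tags("<a title='a > b'>link</a>")))  # " b'>link"
```

Attribute values containing > are rare in practice, but they explain occasional stray fragments after cleaning.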
Is there a difference between Find & Replace and formulas for removing HTML tags?
Find & Replace is faster for one-time cleaning and modifies cells directly, while formulas create new cleaned columns without altering originals. Use Find & Replace for speed; use formulas if you need to preserve original data.
Will this method work with encoded HTML entities like &nbsp; or &lt;?
No. The tag pattern removes only tags, not encoded entities. After the tags are gone, clean entities with additional Find & Replace steps (e.g., find '&nbsp;' and replace with a space, find '&lt;' and replace with '<'), and replace '&amp;' with '&' last so that already-escaped text is not decoded twice.
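The order of operations matters here: strip tags first, then decode entities, so that text such as &lt; is not mistaken for markup. The Python sketch below illustrates that order using the standard library's html.unescape; the function name clean_cell and the sample string are assumptions for the example.

```python
import html
import re


def clean_cell(text: str) -> str:
    """Strip tags first, then decode entities (hypothetical helper)."""
    no_tags = re.sub(r"<[^>]*>", "", text)
    decoded = html.unescape(no_tags)
    # &nbsp; decodes to a non-breaking space; swap it for a normal space.
    return decoded.replace("\u00a0", " ")


print(clean_cell("<p>5&nbsp;&lt;&nbsp;10 &amp; rising</p>"))  # 5 < 10 & rising
```

Decoding before stripping would turn &lt; into a real < and risk removing legitimate text along with the tags.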
