Stripping HTML Comments

Over the last couple of days I have been working on a system that generates HTML emails. The email content is created by loading and rendering ascx controls that substitute values into a template, and the templates included comments describing which data items each email requires. This morning I realized that those comments were ending up in the email bodies, so anyone using ‘View Source’ would see them. Not a huge deal, but it would be better not to have them in there, so I went looking for a way to strip HTML comments.

Everything I found seemed more cumbersome than necessary, so I threw together this simple recursive method to remove HTML comments from a string. It’s naive with respect to the question “what is a comment in HTML?” For my purposes a comment is everything from an opening <!-- through the next closing -->, and that’s it. Posting this in case someone finds it useful.

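private string StripHtmlComments(string html)
{
	// Find the first comment opener; if there is none, we're done.
	int open = html.IndexOf("<!--");
	if (open > -1)
	{
		// Search for the closer from the opener onward, so a stray "-->"
		// earlier in the string isn't mistaken for this comment's end.
		int close = html.IndexOf("-->", open);
		if (close > open)
		{
			// Remove the comment, including the 3-character "-->", then recurse.
			string newHtml = html.Remove(open, (close - open) + 3);
			return StripHtmlComments(newHtml);
		}
		else throw new FormatException("The input HTML contains mismatched comment tags");
	}
	else return html;
}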
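
For templates with a lot of comments, the recursion above builds a new string and adds a stack frame for every comment it removes. Here is a rough sketch of a loop-based variant that makes a single pass with a StringBuilder. The class name HtmlCommentStripper, the method name StripHtmlCommentsIterative, and the sample input are made up for illustration, and it uses the same naive definition of a comment. One difference to be aware of: it does not rescan text that only becomes a comment after an earlier removal joins two fragments together.

```csharp
using System;
using System.Text;

public static class HtmlCommentStripper
{
    // Hypothetical loop-based variant of StripHtmlComments:
    // same naive comment definition, one pass, no recursion.
    public static string StripHtmlCommentsIterative(string html)
    {
        var result = new StringBuilder(html.Length);
        int position = 0;
        while (true)
        {
            // Ordinal: match the exact characters, no culture rules.
            int open = html.IndexOf("<!--", position, StringComparison.Ordinal);
            if (open < 0)
            {
                // No more comments: keep the rest of the input.
                result.Append(html, position, html.Length - position);
                return result.ToString();
            }

            int close = html.IndexOf("-->", open, StringComparison.Ordinal);
            if (close < 0)
                throw new FormatException("The input HTML contains mismatched comment tags");

            // Keep the text before the comment, then skip past "-->".
            result.Append(html, position, open - position);
            position = close + 3;
        }
    }

    public static void Main()
    {
        string html = "<p>Hello <!-- needs: FirstName -->Bob</p><!-- footer -->";
        Console.WriteLine(StripHtmlCommentsIterative(html));
        // prints: <p>Hello Bob</p>
    }
}
```

For the handful of comments in a typical email template either version is fine; the loop only starts to matter when the input is large or comment-heavy.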
