Over the last couple of days I was working on a system that generates HTML emails. The email content is created by loading and rendering .ascx controls that substitute values into a template, and the templates included comments describing which data items each email required. This morning I realized that those comments were ending up in the email bodies, so anyone using ‘View Source’ could read them. Not a huge deal, but it would be better not to have them in there, so I went looking for a method to strip HTML comments. Everything I found seemed more cumbersome than necessary, so I threw together this simple recursive method to remove HTML comments from a string. It’s naive with respect to the question “what is a comment in HTML?” For my purposes a comment is the text between the <!-- and --> delimiters, and that’s it; it makes no attempt to handle those sequences appearing inside script blocks or attribute values. Posting this in case someone finds it useful.
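private string StripHtmlComments(string html)
{
    // Find the first comment opener; if there is none, nothing is left to strip.
    int open = html.IndexOf("<!--", StringComparison.Ordinal);
    if (open < 0)
    {
        return html;
    }

    // Search for the closer only after the opener, so a stray "-->" earlier
    // in the string is not mistaken for the end of this comment.
    int close = html.IndexOf("-->", open + 4, StringComparison.Ordinal);
    if (close < 0)
    {
        throw new FormatException("The input HTML contains mismatched comment tags");
    }

    // Remove the whole comment, including the 3-character closer, then recurse on the rest.
    string newHtml = html.Remove(open, (close - open) + 3);
    return StripHtmlComments(newHtml);
}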
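For templates with many comments, the recursive method above makes one recursive call and one new string per comment. Here is a minimal sketch of an iterative variant that uses the same naive definition of a comment but copies the kept text into a StringBuilder in a single pass. The class name HtmlCommentStripper and the method name StripHtmlCommentsIterative are made up for illustration; they are not part of the original code.

```csharp
using System;
using System.Text;

// Hypothetical helper class, named for illustration only.
public static class HtmlCommentStripper
{
    // Iterative sketch: a comment is simply the text from "<!--" to the next "-->".
    public static string StripHtmlCommentsIterative(string html)
    {
        var result = new StringBuilder(html.Length);
        int position = 0;

        while (true)
        {
            int open = html.IndexOf("<!--", position, StringComparison.Ordinal);
            if (open < 0)
            {
                // No more comments: keep the remaining text and finish.
                result.Append(html, position, html.Length - position);
                return result.ToString();
            }

            // Look for the closer only after the 4-character opener.
            int close = html.IndexOf("-->", open + 4, StringComparison.Ordinal);
            if (close < 0)
            {
                throw new FormatException("The input HTML contains mismatched comment tags");
            }

            // Keep the text before the comment, then skip past the 3-character closer.
            result.Append(html, position, open - position);
            position = close + 3;
        }
    }
}
```

Calling HtmlCommentStripper.StripHtmlCommentsIterative("<p>Hello</p><!-- needs: FirstName --><p>Bye</p>") returns "<p>Hello</p><p>Bye</p>". One difference worth knowing: because this version never rescans text it has already kept, it will not strip a comment that only comes into existence when the text on either side of a removed comment is joined together.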
Nice, this is really cool!