Over the last couple of days I've been working on a system that generates HTML emails. The email content is created by loading and rendering ascx controls that substitute values into a template, and the templates include comments describing which data items each email requires. This morning I realized that those comments were ending up in the email bodies, where anyone using ‘View Source’ could read them. Not a huge deal, but it would be better not to have them in there, so I went looking for a way to strip the HTML comments. Everything I found seemed more cumbersome than necessary, so I threw together this simple recursive method to remove HTML comments from a string. It’s naive with respect to the question “what is a comment in HTML?” For my purposes a comment is the text between the <!-- and --> tags, and that’s it. Posting this in case someone finds it useful.
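private string StripHtmlComments(string html)
{
    int open = html.IndexOf("<!--");
    if (open > -1)
    {
        // Search for the closing tag only after the opening tag, so a stray
        // "-->" earlier in the string is not mistaken for a mismatch.
        int close = html.IndexOf("-->", open + 4);
        if (close > -1)
        {
            // Remove the comment, including the 3-character closing tag,
            // then recurse to strip any remaining comments.
            string newHtml = html.Remove(open, (close - open) + 3);
            return StripHtmlComments(newHtml);
        }
        else
            throw new FormatException("The input HTML contains mismatched comment tags");
    }
    else
        return html;
}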
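The recursive method above builds a new string and adds a stack frame for every comment it removes. For templates with many comments, a loop may be a better fit. The following is a minimal sketch of that idea, not a drop-in replacement: the class name `HtmlCommentStripper` is made up for illustration, and it keeps the same naive definition of a comment (whatever sits between `<!--` and the next `-->`).

```csharp
using System;
using System.Text;

// Hypothetical helper class; the name is an assumption, not part of the original code.
public static class HtmlCommentStripper
{
    public static string StripHtmlComments(string html)
    {
        const string OpenTag = "<!--";
        const string CloseTag = "-->";

        var result = new StringBuilder(html.Length);
        int position = 0;

        while (true)
        {
            int open = html.IndexOf(OpenTag, position, StringComparison.Ordinal);
            if (open < 0)
            {
                // No more comments: copy the rest of the input and finish.
                result.Append(html, position, html.Length - position);
                return result.ToString();
            }

            // Look for the closing tag only after the opening tag.
            int close = html.IndexOf(CloseTag, open + OpenTag.Length, StringComparison.Ordinal);
            if (close < 0)
                throw new FormatException("The input HTML contains mismatched comment tags");

            // Keep the text before the comment, then skip past the comment itself.
            result.Append(html, position, open - position);
            position = close + CloseTag.Length;
        }
    }
}
```

Calling `HtmlCommentStripper.StripHtmlComments("<p>Hi</p><!-- needs: FirstName --><p>Bye</p>")` returns `<p>Hi</p><p>Bye</p>`. An unclosed `<!--` still throws a `FormatException`, as in the recursive version.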
Nice, this is really cool!