I don't seem to get great results with the httpStripHTML function. Is anyone using anything better, or something to supplement it?
Input:
notes = "<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR, MENU, PRE { margin: 0 auto;}</STYLE>
<META name=GENERATOR content="MSHTML 10.00.9200.16736"></HEAD>
<BODY leftMargin=1 rightMargin=1 topMargin=1><FONT size=2 face="Segoe UI">
<DIV>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt"><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Segoe UI','sans-serif'; mso-fareast-font-family: 'Times New Roman'">Guys - final poll of GC for practice tonight - all NO's so far. If you are a YES and haven't responded, please TEXT back immediately. Thanks. Steve<?xml:namespace prefix = "o" ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p></SPAN></P></DIV></FONT></BODY></HTML>"
====================
Output:
notes = "
P, UL, OL, DL, DIR, MENU, PRE { margin: 0 auto;}
Guys - final poll of GC for practice tonight - all NO's so far. If you are a YES and haven't responded, please TEXT back immediately. Thanks. Steve"
OK... so it does a lot, but it still leaves the CSS from the STYLE block in the output, and I need that gone too. I'm reluctant to start micromanaging every tag this function doesn't clean up... but maybe that's the only way?
You mean it hasn't improved since this morning? ;)
Jim
Use IE's COM object and read object.innerText?
(I think later versions also have object.textContent, but I believe that one returns the CSS text as well. Worth looking up on Google.)
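If COM isn't an option, the same idea (keep the text, drop everything inside STYLE and SCRIPT) can be sketched outside WinBatch. This is only a rough illustration in Python using the standard html.parser module. It is not what httpStripHTML or innerText do internally, and the names TextOnly and strip_html are made up for the example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, skipping anything inside <style> or <script>."""

    SKIP = {"style", "script"}  # HTMLParser reports tag names in lowercase

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &amp; etc. into text
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if not self.skip_depth:
            self.parts.append(data)


def strip_html(html):
    """Hypothetical helper: return visible text with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


sample = (
    "<STYLE type=text/css> P { margin: 0 auto;}</STYLE>"
    "<P class=MsoNormal><SPAN>Guys - all NO's so far.</SPAN></P>"
)
print(strip_html(sample))
```

One thing to watch: adjacent elements with no whitespace between them get joined without a space, so real mail bodies may need a separator added for block-level tags.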
Quote from: JTaylor on February 05, 2014, 12:13:10 PM
You mean it hasn't improved since this morning? ;)
Jim
Hope springs eternal? :)
I actually was vaguely aware of double-punching something, but I thought it was all in the morning, 2 minutes apart. I think it's safe to delete this one.
Sorry...sometimes I just can't resist :)
Jim