Putting CF_HTML onto the clipboard

Started by kdmoyers, June 18, 2019, 05:06:38 AM

kdmoyers

Say I have some text
Code (winbatch):
text = "<div>hello <b>there</b> folks</div>"
and I want to place that on the clipboard as valid CF_HTML content, so that it pastes into a text editor or email program.

Ideally, some sort of fancy ClipPutHtml function would exist, but it doesn't.  What strategy/plan of attack can you suggest?

I did find this URL: https://support.microsoft.com/en-us/help/274326/how-to-add-html-code-to-the-clipboard-by-using-visual-basic
but it looks pretty steep for my skill level.  Is that the best route?

TIA,
Kirby
The mind is everything; What you think, you become.

td

There isn't a standard clipboard format specifically for HTML.  I guess the expectation is that it should be treated as plain Unicode or ANSI text. You certainly could create your own clipboard format using a couple of DllCalls and then use BinaryClipPut and BinaryClipGet to place and retrieve your custom format.  However, any application that retrieves the created format will need to be specifically coded to interpret that binary format, as it is not one of the Windows built-in formats that most applications are designed to handle.  For example, text editors are written to grab plain text from the clipboard and image editors will grab bitmaps from the clipboard.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

kdmoyers

Hmm... instructive.  Hadn't realized BinaryClipPut was there.  Hmmm...

There's a cool program from NirSoft called InsideClipboard.exe, which is like an X-ray machine for the clipboard. Using that, it seems like there's a clipboard format (number 49325) that Chrome creates when you copy text, and which seems to be consumed successfully by a bunch of different programs, old and new.

So, equipped with BinaryClipPut, I think I'll try to create that format, and see if I can crash windows! (laugh)

I'll let you know if I get something useful.

-Kirby

kdmoyers

OK, this hack seems to work. It might have bugs.

Since BinaryClipPut wipes the clipboard, I can't do the normal "text format plus special format" thing, but in my use-case, it's fine to just have the one format.

(I'm preparing a short HTML report, so the user can paste it into an email already in progress.)

A cool feature request might be to somehow allow BinaryClipPut to not wipe the clipboard.

-K

Code (winbatch):
#definefunction ClipPutHtml(text)
  ; format number observed with InsideClipboard on this machine
  CFHTMLFORMAT = 49325
  ; each <<part X>> placeholder in the header is 10 characters wide, so
  ; swapping in a 10-digit number later does not shift any offsets
  block = $"Version::0.9
StartHTML::<<part A>>
EndHTML::<<part B>>
StartFragment::<<part C>>
EndFragment::<<part D>>
SourceURL::http:://about::blank
<html>
<body>
<!--StartFragment--><<part E>><!--EndFragment-->
</body>
</html>$"
  ; switch to CRLF if needed
  block = strreplace(block, @crlf, '~EOL~')
  block = strreplace(block, @lf, @crlf)
  block = strreplace(block, '~EOL~', @crlf)

  ; strindex is 1-based but CF_HTML offsets are 0-based, hence the -1
  partA = strindex(block, '<html>', 1, 0) - 1
  partC = strindex(block, '<<part E>>', 1, 0) - 1

  block = strreplace(block, '<<part E>>', text)

  ; strlen counts characters, so the offsets are only right for single-byte text
  partB = strlen(block)
  partD = strindex(block, '<!--EndFragment-->', 1, 0) - 1

  block = strreplace(block, '<<part A>>', strfixbytesl(partA,'0',10))
  block = strreplace(block, '<<part B>>', strfixbytesl(partB,'0',10))
  block = strreplace(block, '<<part C>>', strfixbytesl(partC,'0',10))
  block = strreplace(block, '<<part D>>', strfixbytesl(partD,'0',10))

  ; copy the finished block into a binary buffer and put it on the clipboard
  bb = BinaryAlloc(strlen(block))
  binaryPokeStr(bb,0,block)
  binaryClipPut(bb, CFHTMLFORMAT)
  binaryfree(bb)

  return 1
#endfunction

  ClipPutHtml("hello <strong>there</strong> folks")

  message("OK","try pasting into a window that accepts text cut from browser windows")
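
For anyone who wants to check the offset arithmetic outside WinBatch, here is a minimal sketch in Python (not WinBatch) of the same header layout. It assumes, per Microsoft's description of the HTML clipboard format, that StartHTML, EndHTML, StartFragment and EndFragment are 0-based byte offsets into the UTF-8 payload. The function name build_cf_html and the default SourceURL of about:blank are only illustrative choices, and the sketch builds the bytes without touching the clipboard.

```python
def build_cf_html(fragment, source_url="about:blank"):
    """Wrap an HTML fragment in a CF_HTML ("HTML Format") payload (bytes)."""

    def header(start_html, end_html, start_frag, end_frag):
        # Ten-digit, zero-padded fields keep the header length constant,
        # so it can be measured once with dummy values.
        return (
            "Version:0.9\r\n"
            f"StartHTML:{start_html:010d}\r\n"
            f"EndHTML:{end_html:010d}\r\n"
            f"StartFragment:{start_frag:010d}\r\n"
            f"EndFragment:{end_frag:010d}\r\n"
            f"SourceURL:{source_url}\r\n"
        ).encode("utf-8")

    prefix = b"<html>\r\n<body>\r\n<!--StartFragment-->"
    suffix = b"<!--EndFragment-->\r\n</body>\r\n</html>"
    body = fragment.encode("utf-8")

    # Every offset is a 0-based byte count from the start of the payload.
    start_html = len(header(0, 0, 0, 0))
    start_frag = start_html + len(prefix)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(suffix)
    return header(start_html, end_html, start_frag, end_frag) + prefix + body + suffix


payload = build_cf_html("hello <b>there</b> folks")
print(payload.decode("utf-8"))
```

WinBatch's StrIndex counts from 1 and StrLen counts characters, so a direct port has to subtract one from each StrIndex result, and it will only give correct offsets for single-byte text.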

td

I am familiar with Nirsoft's clipboard browser.  Windows used to come with one but it disappeared some years ago.  Here is MSFT's official list of clipboard formats defined by the system:

https://docs.microsoft.com/en-us/windows/desktop/dataxchg/standard-clipboard-formats

You will see that 49325 is not among them.  Google or anyone else can create a clipboard format but it will only work with applications that have been built with an awareness of it.

td

Here's a very old Tech Database article referencing CF_HTML:

https://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/nftechsupt.web+WinBatch/URLs~-~Web~-~Browser~Topics+Reading~CF_HTML~Format~from~Clipboard.txt

It would appear that MSFT created the format as part of its proprietary WebBrowser object model and apparently other software vendors borrowed it.  So while it isn't a standard system format it is something created by MSFT.  MSFT considers the WebBrowser object model "legacy" and doesn't actively support it anymore.  However, they appear to continue to include it in Windows builds.  At least for now.

I can't imagine they would stop supporting CF_HTML.  But MSFT is in the process of splitting the core OS completely from the Windows shell so there can be multiple shells with the same core.  Theoretically, different shells would appear on different platforms, but I can imagine a time when the desktop platform could be provided with your choice of shell, much like Unix and Linux. So who knows what the future will bring?

kdmoyers

All very interesting.  Explains why the code number 49325 is so high.

It's like a ghost standard --  I just did some testing: Chrome, Firefox and IE all put the same format 49325 on the clipboard when you copy text. I don't have Edge and InsideClipboard.exe on the same machine, or I would test that too.

Welp, in any case, this function seems to work for my parochial little problem here, so I'm happy. 
Thanks for providing BinaryClipPut !

-K

td

Thanks for reminding me of the CF_HTML clipboard format, which I had completely forgotten about. FWIW, Edge does support CF_HTML.

kdmoyers

Additional notes on this topic:

This clipboard format 49325 seems to be accepted in a few places, including Gmail, Smartermail, LibreOffice and OpenOffice (but not WordPad).  Not sure about MS Office products like Outlook or Word.

So it's a handy way for a wbt program to make some nicely formatted stuff (tables and text and images) on the clipboard for users to paste where they can.

One cool thing about the Windows clipboard is that it holds multiple formats simultaneously.  So you might want to place a plaintext version and an HTML version of the same thing on the clipboard.  However, BinaryClipPut wipes the clipboard, so you can only put one.

It would be cool if we could have BinaryClipAdd, so that multiple formats could be placed. 

This is really arcane though, so I'm not sure how many folks besides me would ever use it.  :-\

kdmoyers

OK, here's something amazing.  I just applied my Windows 7 updates and rebooted.  Suddenly the HTML-on-the-clipboard feature doesn't work!  It's there, on the clipboard, but none of the browsers or other programs recognize it anymore.  What gives??

So, once again, I clip some data from a web page to see what ends up on the clipboard.  Yup it's there but -- the code number is different!

Before it was 49325.  Now it is 49386.  So I change that constant in my code and everything works again as before.

Wow.  I'd have thought those code numbers were fixed constants.  Not sure how I deal with this!

kdmoyers

I found this:
http://vb.mvps.org/articles/qa200302.asp
which explains that you have to ask Windows for the number at run time, because it can change.
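
As background on why the number moves: formats created with RegisterClipboardFormat get their IDs at run time from the range 0xC000 through 0xFFFF, while the predefined CF_ constants are small fixed numbers. A quick check in Python (not WinBatch; the helper name is made up for illustration) of the two numbers seen in this thread against that range:

```python
# Range Windows documents for IDs handed out by RegisterClipboardFormat.
REGISTERED_FIRST = 0xC000
REGISTERED_LAST = 0xFFFF

CF_UNICODETEXT = 13  # a predefined format, shown for contrast


def is_registered_format(format_id):
    """True if the ID lies in the run-time registered range."""
    return REGISTERED_FIRST <= format_id <= REGISTERED_LAST


for observed in (49325, 49386, CF_UNICODETEXT):
    print(observed, hex(observed), is_registered_format(observed))
```

Both 49325 (0xc0ad) and 49386 (0xc0ea) land in the registered range, which fits the advice to look the number up by name rather than hard-code it.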

kdmoyers

This gets it:
Code (winbatch):
  dllname = StrCat(DirWindows(1),"user32.dll")
  CFHTMLFORMAT = DllCall(dllname,long:"RegisterClipboardFormatA",lpstr:"HTML Format")

kdmoyers

This concludes today's performance of Bonehead Theater.
Try the veal.

td

I guess the late, great Marty had a reason for including the DllCall in his Tech Database example of how to use CF_HTML.

kdmoyers

That is, of course, where I got it.  Better late than never I guess. 
All my typos are made in tribute to Marty.  ;)