Capture URL's home-page as text

Started by fhammer, February 06, 2017, 06:24:21 PM

fhammer

I'm looking for a straightforward way (from within a WIL script) to capture the output from a URL such as "https://api.ipify.org".

When I manually go to this site via Chrome, it displays a simple text string in my browser window.

I'm currently using a WIL script that opens the URL in Chrome, then uses the browser's "Save Page As" tool to capture and retrieve the page as text. This works, but it is cumbersome and slow, and it involves a lot of SendKeys calls, waits, a temporary file, and seemingly unnecessary displays. I could post the messy script if anyone is interested.

I would like to write a UDF "uGetUrlPageText(url)" that returns the URL's home page as a text string (without a lot of waits, SendKeys, etc.). I've looked around the forum and the extender help files, but I'm obviously not very knowledgeable about this subject.

Could someone point me in the right direction?

Thank you very much.


JTaylor

There are various ways to do this. You might take a look at the WinInet Extender; iReadData or iReadDataBuf might be a place to start.

Jim

td

Depending on the target site, something like the following might be a simple solution:

Code (winbatch)
strUrl =  "https://api.ipify.org"

WinHttpReq = ObjectCreate("WinHttp.WinHttpRequest.5.1")
WinHttpReq.Open("GET",strUrl, @FALSE)
WinHttpReq.SetRequestHeader("Accept-Charset", "utf-8")
WinHttpReq.Send()
WinHttpReq.WaitForResponse()
strHtml=WinHttpReq.ResponseText

Message('Response', strHtml)
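
Since the original question asked for a UDF, the same WinHttp calls could be wrapped roughly as follows. This is only a sketch: the name uGetUrlPageText comes from the question, the check for HTTP status 200 is an added assumption about what should count as success, and there is no error trapping, so a failed request will still raise a WIL COM error.

```winbatch
#DefineFunction uGetUrlPageText(strUrl)
   ; Sketch only: returns the response body as text, or "" on a non-200 status.
   strText = ""
   objReq = ObjectCreate("WinHttp.WinHttpRequest.5.1")
   objReq.Open("GET", strUrl, @FALSE)        ; @FALSE = synchronous request
   objReq.Send()
   If objReq.Status == 200 Then strText = objReq.ResponseText
   objReq = 0                                ; release the COM object
   Return strText
#EndFunction

Message("Response", uGetUrlPageText("https://api.ipify.org"))
```

Because the request is opened synchronously, Send() should not return until the response has arrived, so this sketch omits the WaitForResponse() call.
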

fhammer

I tried the suggested WinInet extender approach (thank you, JTaylor) and wrote a UDF, GetURLText(url).


#DEFINEFUNCTION GetURLText(url)

; Returns text (up to 10k characters) from a URL.
; In case of failure an error message is displayed and a null string is returned.

  fcn = "GetURLText"
  urltext = ""

  AddExtender("WWINT44I.DLL")

  If iCheckConn(url) == 0         ; validate url
    Message(fcn,"Cannot establish an internet connection to URL %url%")
    Return ""
  EndIf

  maxsize = 10000            ; max size of  returned text

  th = iBegin(0,"","")            ; open top handle (0: direct internet connection)
  dh = iUrlOpen(th,url)            ; open data handle to url

  buf = BinaryAlloc(maxsize)         ; handle to binary buffer
  bufaddr = IntControl(42, buf, 0, 0, 0)      ; address of binary buffer
  BinaryEodSet(buf,maxsize)         ; set EOD marker at end of binary buffer

  n = iReadDataBuf(dh,bufaddr,maxsize)      ; Read text from url
  If n == 0
    Message(fcn,"Unable to retrieve text from %url%")
  Else
    urltext = BinaryPeekStr(buf,0,n)      ; convert relevant binary buffer content to string
  EndIf      

  BinaryFree(buf)
  iClose(dh)               ; close data handle
  iClose(th)               ; close top handle

  Return urltext

#ENDFUNCTION

; ------------------------------------------------------------------------------------
; Test Program

fcn = "GetURLText Utility"

u = AskLine(fcn,"Enter URL","https://api.ipify.org")

text = GetURLText(u)
textlen = StrLen(text)
ClipPut(text)               ; put entire original unformatted text in clipboard

; Special substitutions for display of first <= 500 characters

text = StrSub(text,1,500)         ; truncate text to 500 bytes or less
text = StrReplace(text,@CR,Num2Char(169))   ; * replace CRs with © characters
text = StrReplace(text,@LF,Num2Char(172))   ; * replace LFs with ¬ characters
text = StrReplace(text,@TAB,Num2Char(187))   ; * replace tabs with » characters

; Construct display message

msg = StrCat(textlen," bytes of text returned from URL ",u)
msg = StrCat(msg,@CRLF,"First <= 500 bytes (with substitutions: CR -> ©, LF -> ¬, tab -> ») = ",@CRLF,@CRLF,"[")
msg = StrCat(msg,text)
If textlen > 500 Then msg = StrCat(msg," ... ")
msg = StrCat(msg,"]",@CRLF,@CRLF,"Complete text (without substitutions) is in Clipboard.")

Message(fcn,msg)            ; Issue message and exit

Exit

----------------------------------------------------------------------
A related question:

Does anyone know where I can find other free public URLs, like "https://api.ipify.org", that perform simple services? For example, there should be a public URL/site that returns the current UTC/GMT time as a text string.