How to read an HTML page from a remote site into an HTMLDocument using VBA/.NET


There are a lot of ways to read and parse HTML. The better tricks don’t use IE itself, since automating IE tends to throw automation errors and waste memory.

I spend 99% of my time on .NET programming, but one of my hobby projects still uses an Access 2013 database and thus a VBA codebase, yummy! To get power features, I compiled a type library (tlb) that exposes interfaces like IPersistStreamInit, IStream etc. (this is called ODL compiling and requires MkTypLib.EXE, not midl.exe!)

Now here is a neat way to fetch plain HTML text and load it into an HTMLDocument without any dependency on IE automation. You’re a smart, non-lazy programmer (right?), so you get the idea for C# as well, since you need IPersistStreamInit there too. It’s COM interop, dude!

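Public Function HttpGet(ByVal url As String) As mshtml.HTMLDocument
    Dim xmlHttp As MSXML2.ServerXMLHTTP60
    'use the 6.0 ProgID so the object matches the declared type
    Set xmlHttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    xmlHttp.Open "GET", url, False
    'send the synchronous request; without this responseBody is never filled
    xmlHttp.send
    'set return value
    Set HttpGet = New mshtml.HTMLDocument
    Dim stream As ADODB.Stream
    Set stream = CreateObject("ADODB.Stream")
    Dim istrea As IPersistStreamInit
    'get interface IPersistStreamInit from HTMLDocument
    Set istrea = HttpGet
    'write the response body as a binary array (bytes)
    stream.Type = adTypeBinary
    'the stream must be opened before it can be written to
    stream.Open
    stream.Write xmlHttp.responseBody
    'reset stream
    stream.Position = 0
    'load the stream into the HTMLDocument
    istrea.Load stream

    Dim s As Single
    s = Timer

    'fake body onload ready: wait at most 10 seconds for parsing to finish
    Do Until Timer - s > 10 Or HttpGet.readyState = "complete"
        'let pending messages be processed so the document can finish loading
        DoEvents
    Loop
    stream.Close
End Function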
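
Calling it could look roughly like the sketch below. It assumes the HttpGet function above lives in the same module and that the project references the Microsoft HTML Object Library. The DemoHttpGet name and the example.com URL are placeholders of mine, not part of the original code.

```vba
Public Sub DemoHttpGet()
    Dim doc As mshtml.HTMLDocument
    'placeholder URL, replace it with the page you want to parse
    Set doc = HttpGet("https://example.com/")

    'HttpGet gives up after 10 seconds, so check the state before parsing
    If doc.readyState <> "complete" Then
        Debug.Print "Document not ready, state: " & doc.readyState
        Exit Sub
    End If

    Debug.Print doc.Title

    'list the target of every anchor on the page
    Dim link As Object
    For Each link In doc.getElementsByTagName("a")
        Debug.Print link.href
    Next link
End Sub
```

Checking readyState first matters because the wait loop in HttpGet can also exit on its timeout, in which case the document may only be partially parsed.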
