Mail scraper / crawler / spider

wolla:
Hello from Germany,

Is there a way to write a script for crawling email addresses from Google?

Thank you for your reply
Wolfgang

td:
WebBatch is a CGI scripting language for Web servers, so I am not sure how that connects to crawling email addresses. It is also unclear, to me at least, what you even mean by crawling email addresses. So perhaps a more detailed explanation of exactly what you wish to do is in order.

snowsnowsnow:
Whatever it is, it doesn't sound good.

But I guess, a man's got to eat.

P.S.  In the past, when people posted wanting to do something obviously bad, they would get a talking to and then the thread would die.  Nowadays, that doesn't seem to happen.

wolla:

--- Quote from: td on March 11, 2019, 07:15:24 am ---WebBatch is a CGI scripting language for Web servers, so I am not sure how that connects to crawling email addresses. It is also unclear, to me at least, what you even mean by crawling email addresses. So perhaps a more detailed explanation of exactly what you wish to do is in order.

--- End quote ---

For marketing activities I want to create a batch script that reads email addresses from web pages.
I want to open a search engine like Google, enter a keyword, and collect the mailto: addresses from all the pages listed in the results.

- The script must open the Google results programmatically
- find the web addresses (URLs) of the companies
- open each URL and find the mailto: address in the page source
- store the address in a file and move to the next entry

Any ideas?

Thanks very much for any feedback
Wolfgang

td:
As previously mentioned, WebBatch is a CGI scripting language for Web servers, so it is a tool for creating Web content and not for scraping Web content. WinBatch, on the other hand, can be used to scan (and scrape) Web pages for specific content. There are legitimate reasons for performing Web scraping, and there are multiple examples of doing this in the Tech Database. That said, many if not most Websites disguise email addresses to prevent exactly the kind of activity you are proposing. The reason for this should be obvious.
