Author Topic: mail scraper /crawler / spider  (Read 634 times)

wolla

  • Newbie
  • *
  • Posts: 3
mail scraper /crawler / spider
« on: March 11, 2019, 03:46:50 am »
Hello from Germany,

is it possible to write a script for crawling email addresses from Google?

Thank you for any reply
Wolfgang

td

  • Tech Support
  • *****
  • Posts: 3079
    • WinBatch
Re: mail scraper /crawler / spider
« Reply #1 on: March 11, 2019, 07:15:24 am »
WebBatch is a CGI scripting language for Web servers, so I am not sure how that connects to crawling email addresses. It is also unclear, to me at least, what you even mean by crawling email addresses. So perhaps a more detailed explanation of what exactly you wish to do is in order.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

snowsnowsnow

  • Sr. Member
  • ****
  • Posts: 275
Re: mail scraper /crawler / spider
« Reply #2 on: March 11, 2019, 07:28:17 am »
Whatever it is, it doesn't sound good.

But I guess, a man's got to eat.

P.S.  In the past, when people posted wanting to do something obviously bad, they would get a talking to and then the thread would die.  Nowadays, that doesn't seem to happen.

wolla

  • Newbie
  • *
  • Posts: 3
Re: mail scraper /crawler / spider
« Reply #3 on: March 11, 2019, 08:13:25 am »
Quote from td:
"WebBatch is a CGI scripting language for Web servers, so I am not sure how that connects to crawling email addresses. It is also unclear, to me at least, what you even mean by crawling email addresses. So perhaps a more detailed explanation of what exactly you wish to do is in order."

For marketing activities I want to create a batch script that reads email addresses from web pages.
I want to open a search engine like Google in a browser, enter a keyword, and find the mailto: address on each page listed in the results.

- The script must open the Google results programmatically
- find the web addresses (URLs) of the companies
- open each URL and find the mailto: address in the page source
- store the address in a file and move to the next entry

Any ideas?

Thanks very much for all feedback
Wolfgang

td

  • Tech Support
  • *****
  • Posts: 3079
    • WinBatch
Re: mail scraper /crawler / spider
« Reply #4 on: March 11, 2019, 01:35:28 pm »
As previously mentioned, WebBatch is a CGI scripting language for Web servers, so it is a tool for creating Web content and not for scraping Web content. WinBatch, on the other hand, can be used to scan (and scrape) Webpages for specific content. There are legitimate reasons for performing Web scraping and there are multiple examples of doing this in the Tech Database. That said, many if not most Websites disguise email addresses to prevent exactly the kind of activity you are proposing. The reason for this should be obvious.

td

  • Tech Support
  • *****
  • Posts: 3079
    • WinBatch
Re: mail scraper /crawler / spider
« Reply #5 on: March 11, 2019, 01:40:26 pm »
Quote from snowsnowsnow:
"Whatever it is, it doesn't sound good.

But I guess, a man's got to eat.

P.S.  In the past, when people posted wanting to do something obviously bad, they would get a talking to and then the thread would die.  Nowadays, that doesn't seem to happen."

Actually, it does still happen.  It is just that we prefer not to rush to judgment.

stanl

  • Pundit
  • *****
  • Posts: 948
Re: mail scraper /crawler / spider
« Reply #6 on: March 11, 2019, 02:44:14 pm »
Quote from td:
"It is just that we prefer not to rush to judgment."


Could probably work with straight WinBatch web scraping. The OP needs to provide example keyword(s) so one could see what is returned and how a mailto address is set up in the page (an href attribute or something similar).
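
To show what "how a mailto address is set up" usually means in page source, here is a minimal sketch. It is in Python rather than WinBatch, it only parses an HTML string already in memory (nothing is fetched), and the names MailtoCollector, extract_mailto and SAMPLE, along with the sample markup itself, are made up for illustration. It also shows why disguised addresses are not picked up by this kind of scan.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collects addresses from <a href="mailto:..."> links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the scheme and any "?subject=..." query part.
                address = unquote(value[len("mailto:"):].split("?", 1)[0]).strip()
                if address and address not in self.addresses:
                    self.addresses.append(address)


def extract_mailto(html):
    """Return the unique mailto: addresses found in an HTML string, in order."""
    parser = MailtoCollector()
    parser.feed(html)
    return parser.addresses


# Made-up sample markup: two plain mailto links and one disguised address.
SAMPLE = """
<p>Contact: <a href="mailto:info@example.com?subject=Hello">Mail us</a></p>
<p>Sales: <a href="MAILTO:sales@example.com">Sales</a></p>
<p>Obfuscated: info [at] example [dot] com</p>
"""

print(extract_mailto(SAMPLE))  # ['info@example.com', 'sales@example.com']
```

The third line of the sample is written the way many sites disguise addresses; since it is not a mailto link, the parser never sees it.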

wolla

  • Newbie
  • *
  • Posts: 3
Re: mail scraper /crawler / spider
« Reply #7 on: March 12, 2019, 04:08:07 am »
Thanks all for the replies.
In the end, web scraping is allowed....
Regards
Wolfgang