@ryanscripts
We are unable to reproduce your problem yet.
Please attach the project in which the problem definitely occurs. You can attach it here or in the forum PM.
Is this website really build a protection against scraping?
-
The website is https://www.proxyrotator.com/free-proxy-list/.
After months of mastering BAS (and learning JavaScript, Node, regex, XPath, etc.), I became confident there was no website I couldn't extract information from... until I tried the one above.
If anyone knows or has an idea how to get the proxies from it, please share.
I'm really curious what solutions we have in BAS for it (other than the obvious one: a screenshot with OCR over it).
-
@hungrym said in Is this website really build a protection against scraping?:
The website is https://www.proxyrotator.com/free-proxy-list/.
After months of mastering BAS (and learning JavaScript, Node, regex, XPath, etc.), I became confident there was no website I couldn't extract information from... until I tried the one above.
If anyone knows or has an idea how to get the proxies from it, please share.
I'm really curious what solutions we have in BAS for it (other than the obvious one: a screenshot with OCR over it).
What exactly is the problem? I looked at this site. In my opinion it is simple: a little inconvenient to parse, but in general not a problem.
-
@usertrue And how exactly? Did you check the source code? You can't copy and paste a proxy (just try it in a normal browser), so how would you write a BAS script to parse it? I mean, what XPath, CSS selector, or regex would you use to take the full proxy and add it to a list in BAS?
-
@hungrym said in Is this website really build a protection against scraping?:
And how exactly? Did you check the source code? You can't copy and paste a proxy (just try it in a normal browser), so how would you write a BAS script to parse it? I mean, what XPath, CSS selector, or regex would you use to take the full proxy and add it to a list in BAS?
Yes, I looked at the page code, it has everything you need.
-
@hungrym I wrote some JS that runs in the browser and collects the data. But the port comes as a base64-encoded image, so you'll have to work out the rest yourselves. There are recognition modules for Node.js, but I don't have time for that.
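{
  let proxy = [];
  // Data rows are the <tr> elements without a class attribute
  let rows = Array.from(document.querySelectorAll('tbody tr:not([class])'));
  rows.forEach(row => {
    // IP: keep only child elements that are on top at their own top-left corner
    let ip = Array.from(row.querySelectorAll('td:nth-of-type(2)>*'))
      .filter(el => {
        let xy = el.getBoundingClientRect();
        return el == document.elementFromPoint(xy.x, xy.y);
      })
      .map(el => el.textContent)
      .slice(0, -1) // drop the last visible element
      .join('');
    // Port: a piece of the image src in the third cell (still needs recognition)
    let port = row.querySelectorAll('td:nth-of-type(3)>img')[0].src.split(';')[2];
    let loc = row.querySelectorAll('td:nth-of-type(4)')[0].textContent.trim();
    let type = row.querySelectorAll('td:nth-of-type(6)')[0].textContent;
    proxy.push({ip, port, type, loc});
  });
  // Result of the block: the collected rows as a JSON string
  JSON.stringify(proxy)
}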
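For anyone picking up the port part: below is a minimal Node.js sketch of turning a base64 image src into raw bytes, which is what a recognition module would usually need as input. It assumes the src is an ordinary data URI of the form `data:image/png;base64,...`; the exact format this site uses is not confirmed in the thread. `decodeDataUri` is a hypothetical helper name, and the sample value is made up for illustration.

```javascript
// Hypothetical helper (not from the thread): split a data URI into its parts
// and decode the payload into raw bytes with Node's built-in Buffer.
function decodeDataUri(src) {
  const comma = src.indexOf(',');
  if (!src.startsWith('data:') || comma === -1) {
    throw new Error('not a data URI');
  }
  // The header looks like "image/png;base64"; the payload follows the first comma.
  const header = src.slice('data:'.length, comma).split(';');
  const payload = src.slice(comma + 1);
  const isBase64 = header.includes('base64');
  return {
    mime: header[0] || 'text/plain',
    bytes: isBase64
      ? Buffer.from(payload, 'base64')
      : Buffer.from(decodeURIComponent(payload), 'utf8'),
  };
}

// Made-up sample: the text "8080" as base64, standing in for a real image.
const sample = 'data:text/plain;base64,' + Buffer.from('8080').toString('base64');
const decoded = decodeDataUri(sample);
console.log(decoded.mime, decoded.bytes.toString('utf8'));
```

With a real port image, `decoded.bytes` could be written to a file with `fs.writeFileSync` or handed to whichever OCR module you choose; the recognition step itself is not covered here.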