User Friendly Web Scraper?


#1

Hi All … can anyone recommend a GUI web-scraper that will

  • grab all the links on a page, and
  • follow each link to its endpoint or down a couple levels?

I have to put together an information architecture for a large site, and automating at least some of that process would be so helpful.

Nearly everything I’ve found is code-based or doesn’t have the friendliest interface.


#2

Hey Tmc,

I always use import.io for most of my requirements. It’s easy to use and just works…

They also have a more detailed version that can be customized for deep scraping.

All the best!


#3

I haven’t needed to do this in some time, but in the past I used Site Orbiter: http://www.siteorbiter.com/. I’m not sure it’s actively updated, but it might be worth a try.


#4

Thanks for the suggestions Vipul & Maria!

I’ve gotten a couple of recommendations for import.io so I’ll definitely check it out. Site Orbiter looked really promising, but when I tried the download link from their site, it said it’s not available in the U.S. App Store :frowning:


#5

Hi All!

I wanted to check back in, both to thank everyone for their recommendations and to let you know that we ended up using Content Insight’s Content Analysis Tool. It doesn’t follow each link to its endpoint or down a couple levels, but it does list all the internal inbound and outbound links on a given page, which is pretty much what my team needed.

Unlike some of the other products I played around with (import.io, webscraper.io, etc.), it was super easy to figure out and set up. So. Easy. With other products, I kept having to go back and try different configurations to get a usable/useful content scrape, and I still wasn’t able to pull all that I needed.
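
For anyone who finds this thread later and doesn't mind a little code: the two things asked for in the first post (grab all the links on a page, then follow them down a couple of levels) can be roughly sketched with only the Python standard library. This is a minimal illustration, not how any of the tools mentioned above work. The names `LinkCollector`, `split_links`, and `crawl` are made up for this example, and `fetch` is assumed to be a function you supply that takes a URL and returns that page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) absolute links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    site = urlparse(page_url).netloc
    internal, external = [], []
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urldefrag(urljoin(page_url, href)).url
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        bucket = internal if parsed.netloc == site else external
        if absolute not in bucket:
            bucket.append(absolute)
    return internal, external


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl of internal links, at most max_depth levels down.

    fetch is a caller-supplied function: url -> HTML string (an assumption
    of this sketch, so the crawl logic stays independent of the network).
    """
    depth_of = {start_url: 0}
    queue = [start_url]
    site_map = {}
    while queue:
        url = queue.pop(0)
        internal, external = split_links(url, fetch(url))
        site_map[url] = {"internal": internal, "external": external}
        if depth_of[url] < max_depth:
            for link in internal:
                if link not in depth_of:
                    depth_of[link] = depth_of[url] + 1
                    queue.append(link)
    return site_map
```

To run it against a real site, `fetch` could be a small wrapper around `urllib.request.urlopen` that reads and decodes the response. The returned dictionary maps each visited page to its internal and external links, which is a reasonable starting point for a site inventory. A real crawl should also respect robots.txt and rate limits, which this sketch leaves out.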