How do you feel about scrapers?

Do they feel a bit too techy for you? Like they’re not worth the effort? Well, you might reconsider if you keep reading.

Do you remember the times we had no internet connection?

Yes, there were such times, late millennials. 🙂

There were times when we had to do our research in the library, when we had to send actual mail from the post office, and when our only option to communicate with the world was to wait or to make a phone call.

And, yes, we had to use a landline if we had to make a call.

So we weren’t available 24/7 to everybody.

Did our generation survive these modern dark ages?

Yes, we surely did.

Did we end up being successful (at least some of us)?

Yes, we did.

Did businesses fall apart due to the lack of internet connection?

No, they didn’t.

Some of them did, but it surely wasn’t because we had no internet.

Imagine living in such an era.

And on top of that, imagine surviving it, graduating, getting a master’s or even a PhD, starting a business and growing it.

And we did just fine.

A big chunk of the great human progress and scientific breakthroughs came before the era of the Internet. While the eras before ours had the disadvantage of not being able to communicate at the speed of light, they also had reduced distractions caused by information overload.

They didn’t have to spend 3 hours a day staring at some random stranger’s lunch photo or pictures of kittens.

There is nothing important that we do that was impossible before the arrival of the Internet.

Everything that we do now had its “slower” equivalent.

Libraries, newspapers and bookstores covered the functions of the web.

While Google has indeed brought libraries to us at the speed of light, it has also made us lazier about storing essential bits in our brains. Between our imagination and finely bound books, a whole range of innovations was built.

The bottleneck when it comes to innovation is our brain and not the communication channels.

Now, everything we need is just a click away.

And, yet, we somehow manage to miscommunicate with each other.

That’s the easiest thing in this world, but for me, miscommunication means damaging my business, my employees, my reputation and myself as a person.

So I am really careful when it comes to communicating, delivering the right message and listening and understanding my clients’ needs.

I learned not to assume, but to ask, then calculate and measure my chances and track my growth.

However, I don’t do all this alone.

I have a few tricks up my sleeve, one of which is data scrapers.

What are data scrapers?

As you may have read in one of our past blogs, data scrapers are your best pals when it comes to extracting valuable data from the Internet, building a meaningful relationship with your clients and improving your outreach game by leaps and bounds.

I am well aware that the data extraction process can be complex and demanding. But with the right data scraping tools under your belt, you’ll be just a few clicks away from obtaining high-quality data.

I have worn your shoes and walked many miles in them to be where I am now.

Data scrapers alone won’t take your outreach game to the finals. You’ll still have to bounce the ball yourself, but they will surely propel you forward on the outreach court.

Hey, nobody said it was easy, but I’m sure that you are well aware of that by now.

Finding the right data is the bridge between your blank Excel document and increasing your sales and/or productivity. The modern-day internet is more used than medical antiseptic gels (this past year) – people produce a staggering 2.5 quintillion bytes of data every day. Whether you’re just about to launch your dream project or you’ve owned your business for decades, the information found in data is what helps you draw potential customers away from your competitors and keep them coming back. Data scraping, or extracting useful data from the Internet and converting it into a useful format (like a spreadsheet), is a key component of quickly building a solid B2B database.

Web data tells you almost everything you need to know about those clients, from the average prices they’re paying to the must-have features of the moment. But not every SME has the time or budget to spend hours on manually extracting and validating data. That’s where web or data scraping tools come in, although the process can still be extremely intimidating.

It is difficult to say exactly which factors should be taken into consideration when choosing a suitable data scraping tool. Of course, different users have very different needs, and there are many tools out there for all of them.

In the previous blog on data scrapers that we mentioned earlier, we gave you a list of 34 data scrapers that we’re sure came in handy.

We believe you read it and loved it, and that you now have the knowledge you need on data extraction and building some seriously solid databases.

The method

There are tons of data scrapers out there, and it would take months to evaluate each and every one of them. For starters, it’s practically impossible to compare them. Each tool has its own unique benefits and, depending on your needs, brings a different value.

But in order to show you how you can use these tools, I decided to pick one tool and go through the B2B database building process. The tool I used is Snov.io. I chose Snov.io not because I am in any way related to it or have an affiliate link to share, but purely to show how scrapers can speed up the entire prospecting process.

Snov.io in particular is a great lead generation and outreach tool for both B2B companies and one-man-army sales and marketing specialists, but in addition to that, it can be used in so many other ways.

But enough chit-chat, let’s get down to business.

To prove to you that I am not an all-words-no-action kinda guy, I will present three case studies. That means three clients, three different kinds of problems, three different solutions and just one data scraping tool. You can always choose the tool that you prefer and see if it produces different results.

The three cases I will be presenting are real client cases, so in order to protect their privacy I won’t use their real names, but the solution that I will show you is 100% real. I chose these three examples because I believe they are quite different and depict the variety of approaches that I tend to use.

Fingers crossed for you to identify with one of them.

Case 1: An existing database of people

Some time ago, I had a client that developed an AI-based SaaS that interfaced with existing customer service management technologies. Their services were focused on taking customer service to the next level by increasing agents’ productivity with task automation and recommendations.

They used data science to deliver valuable insights to management teams. When I say valuable insights, I mean informing the management teams about the dissatisfaction of their customers and the kind of assistance they need.

This client already had an existing list of companies and positions, but no e-mail addresses.

They wanted to start a cold email sequence, but as you may imagine, without the e-mails, that was quite impossible.

That’s when we came on stage.

Here’s the list that we extracted from the original one that we got. We’ll showcase 197 of the total 3500 people, to keep this study from turning into a book.

An existing database of prospects

Now let’s show you the BizzBee magic.

  1. Save a copy of the Excel file. It’s not smart to work on the original spreadsheet.
  2. For this task, I used Snov.io’s Emails from Names tool. It takes an uploaded CSV file with the person’s name, last name and job title.
  3. In the Excel sheet, I deleted all the other columns and renamed the remaining ones to match Snov.io’s requirements. Then I saved it as a CSV file, separate from the original file.
  4. I uploaded the file to Snov.io, and it informed me that it would cost me 198 credits if it found an email for each contact – which was acceptable, so I continued with the upload.
Snovio example

5. After 10 seconds, the list was ready to be opened. I got the following report:

a. 4 – contacts with no e-mails
b. 87 – contacts with valid e-mails
c. 75 – contacts with e-mails that cannot be confirmed
d. 26 – contacts with invalid e-mails

6. The next step is to export the list in “XLSX” format and choose which fields you need. For this project, I only needed email, email validity, first name, last name, full name and user social (LinkedIn profile). You can then download the Excel file from Snov.io.

7. Copy the data from Snov.io into a separate sheet (“Sheet 2”) of the original Excel file. This avoids errors from mixing up the two files.

8. You will notice that the rows in the Snov.io sheet and the original sheet don’t line up. What you’ll need to do here is:

a. Extract the first name and last name from the original sheet
b. Search for them in the second sheet to find a matching e-mail
c. When you find it, update the original sheet
d. A few more steps are required to do this process automatically

9. In a separate column in sheet 1, combine the names and last names. To avoid mistakes, use the formula CONCATENATE(text1, [text2], …).

Using formula in the Snovio scraper

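10. The e-mail column needs to come right after the Name and Surname column in the Snov.io sheet. You’ll need this for the next formula, because VLOOKUP searches the first column of the range you give it and returns a value from a column to its right.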

11. The VLOOKUP formula will be looking for the same full name from sheet 1 in sheet 2. Upon finding it, it will copy the e-mail from that row.

Using formulas in scrapers

12. By simply dragging the e-mail cell down the column, the same formula will be applied to all the other rows. This way the rows with a match will be filled with e-mail addresses.

13. If we want to add the Verify status, we need to copy that column (in sheet 2) right after “full name” and “e-mail.” This field can also be filled by using the VLOOKUP formula.

14. The next smart step is to copy the “E-mail” and “validity” columns and paste them using “Paste values only”, so that the data won’t be altered after you delete Sheet 2.

15. There are 20 out of 200 contacts that Snov.io could not find an e-mail for. What you can do here is to filter out all the rows that don’t have an e-mail or have a #N/A status.

  • No domain (9): Rows 9, 12, 19, 21, 28, 36, 158, 171, 188 – without a domain, Snov.io could not find their e-mail.
  • Weird name/surname (7): Rows 35, 50, 53, 55, 92, 166, 173 – the name or the last name contains an unusual letter that Snov.io can’t recognize.
  • No apparent reason (4): Rows 164, 165, 184, 193 – these seem OK, but Snov.io did not process them.

These 20 contacts comprise only 10% of the 200 leads. I tried googling the name, last name and position, and found their LinkedIn profiles. Then I used the Snov.io Chrome extension and got their e-mails. This worked for 18 out of the 20 contacts, and it took me less than 15 minutes.

16. Now that the automation part is finished, we need to bring everything back to its original state. After returning the columns to the original sheet, I deleted sheet 2, copied the e-mails we’d found to Column B (their original place in the client’s Excel sheet) and then removed all the extra columns we’d added.

The database is completed.

17. Next, I took the CSV version of the sheet and ran it through bulk verification. When the Bulk E-mail checker finished, we got the following results:

a. Passed: 76
b. Failed: 36
c. Unknown: 84
d. Total: 196

The final statistics.

18. The last step is to try to manually replace the failed e-mails. We managed to get 160 e-mails (81.6% automation) out of the original 196 prospects with no e-mails. We sourced the remaining 36 manually.
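
If you prefer a script to spreadsheet formulas, the CONCATENATE and VLOOKUP steps above can be sketched in a few lines of Python. This is only a minimal sketch, not the Excel process described above: the column names (`First name`, `Last name`, `Email`, `Email status`) and the sample people are made up for illustration, and the lookup key is trimmed and lowercased, which is an assumption on my side and a bit more forgiving than a plain VLOOKUP.

```python
def full_name(first, last):
    """Build the lookup key, like CONCATENATE(first, " ", last) in step 9."""
    return f"{first.strip()} {last.strip()}".lower()


def build_email_index(snov_rows):
    """Map full name -> (email, status) from the exported sheet (sheet 2)."""
    index = {}
    for row in snov_rows:
        key = full_name(row["First name"], row["Last name"])
        # Keep the first match only, as an exact-match VLOOKUP would.
        index.setdefault(key, (row["Email"], row["Email status"]))
    return index


def fill_emails(original_rows, index):
    """Return (all rows with e-mail columns added, rows still missing an e-mail)."""
    filled, missing = [], []
    for row in original_rows:
        key = full_name(row["First name"], row["Last name"])
        email, status = index.get(key, ("", "#N/A"))
        merged = {**row, "Email": email, "Email status": status}
        filled.append(merged)
        if not email:
            missing.append(merged)
    return filled, missing


# Hypothetical sample data, standing in for the two Excel sheets.
original = [
    {"First name": "Ana", "Last name": "Petrova", "Job title": "Head of Support"},
    {"First name": "Marko", "Last name": "Ilic", "Job title": "CX Manager"},
]
snov_export = [
    {"First name": "marko", "Last name": "Ilic ",
     "Email": "marko@example.com", "Email status": "valid"},
]

filled, missing = fill_emails(original, build_email_index(snov_export))
for row in missing:
    # These are the rows you would filter out and research manually (step 15).
    print(row["First name"], row["Last name"], row["Email status"])
```

The `missing` list plays the role of the #N/A filter: it is the short list you take to LinkedIn by hand.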

Case 2: Leads for catering and food ordering company

This next client, with whom I had the pleasure to work, was a Michelin-starred chef and a business-savvy entrepreneur who had opened his own food-delivery company.

Unlike other companies of this type, this company had their own kitchen where fresh, healthy food was prepared and then delivered to other companies.

For the purposes of this project, we had to find potential clients in Hamburg, Germany. We targeted 500 companies with 50+ (preferably 100+) employees.

The ideal targets were people who were responsible for food ordering or event organizing.

Our client already had a database of 1893 contacts who needed to be excluded from this prospecting process.

This was our biggest challenge: finding companies in Hamburg that our client didn’t already have in their database.

Our solution was the following:

Option 1 – LinkedIn Scraping

1. I typed Hamburg and chose the 50-200 employee filter on LinkedIn Sales Navigator. The results I got were a total of 842 companies.

2. By clicking “Select all”, you will select all 25 companies on the first page. If you go to “view current employees”, you will switch to lead search. By clicking on company, you can see that all 25 companies from this page are pre-filled.

LinkedIn Sales Navigator filters

3. The next step is to add all the positions. Based on the requirement I added the following: assistant, community manager, event manager, front desk, HR manager, office manager, project manager and got 158 relevant positions.

4. By using the Snov.io Chrome extension, I downloaded all these 158 people; I set up the start page at 1 and the end page at 7 (as I had 158 people) and kept the pre-defined timeout values. Once it started, Snov.io collected that info in the background.

Snovio scraper

5. Next, you need to repeat steps 2-4 for the next pages as you did for page 1. That’s how we collect the leads from all companies.

6. Once you are finished you can download your list from Snov.io. You can find it under ‘Prospects’. From now on, the process becomes manual.

Downloadable list from scrapers - Snovio

Option 2 – Snov.io platform

1. This option allows you to work directly from Snov.io by using the Snov.io Company Profile Search feature.

2. In the filters, after you select Germany, Hamburg, 51-200 employees, a list of 609 companies that match the criteria pops out. Then click the ‘minus’ option to the left of ‘name’ to get the options to select all and save the list.

Company profile search on scrapers - Snovio

3. Then select ‘Businesses’ on the main menu and on the left, you will see the companies’ list, a bunch of additional data, as well as an option to export the list to CSV.

List of filters on scrapers

4. You can use the Bulk Domain search feature in Snov.io to look for certain positions from the company list. You can upload the list or choose from the companies list. Then it’s up to you to search for prospects or e-mails, define positions and country, and start the search.

5. Snov.io’s job is to find these positions. In my case, we found 104 people from 104 domains, all from different companies. You can also look for a couple of people from the same company and see if you are happy with the results.

6. Once you add them to a list, it will appear on the “Prospects” tab in Snov.io, ready to download.

7. From here on the process is manual.
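
One part of that manual work, excluding the companies the client already has, can also be scripted. Here is a rough Python sketch, under the assumption that you compare contacts by company domain. The column names (`Email`, `Website`, `Company`) and the sample companies are hypothetical, and matching by domain is my simplification, not a feature of Snov.io or LinkedIn.

```python
from urllib.parse import urlparse


def normalize_domain(value):
    """Reduce an e-mail address or a website to a bare, lowercase domain."""
    value = value.strip().lower()
    if "@" in value:
        # For an e-mail address, keep only the part after the last "@".
        value = value.rsplit("@", 1)[1]
    if "//" not in value:
        # urlparse only recognizes the host when a "//" prefix is present.
        value = "//" + value
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def exclude_existing(new_leads, existing_contacts):
    """Drop leads whose company domain already appears in the client's database."""
    known = {normalize_domain(c["Email"]) for c in existing_contacts}
    known.discard("")  # ignore contacts without a usable e-mail
    return [
        lead for lead in new_leads
        if normalize_domain(lead["Website"]) not in known
    ]


# Hypothetical sample data.
existing = [{"Email": "anna@alpha-gmbh.example"}]
leads = [
    {"Company": "Alpha", "Website": "https://www.alpha-gmbh.example/"},
    {"Company": "Beta", "Website": "beta.example"},
]

fresh = exclude_existing(leads, existing)
print([lead["Company"] for lead in fresh])
```

Leads without a website are kept on purpose, so that you can review them by hand instead of losing them silently.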

Case 3: SEO offering

The third client is a digital marketing agency that helps other companies generate more traffic by creating content that matches the user’s intent.

They don’t rely on data from third parties, and they’ve been studying search and content trends since 2005 − compiling the industry’s largest global and historical database.

The client’s main goal is to uncover the opportunities and pitfalls of online marketing. Their award-winning products bring search engine optimisation and content creation together for the first time, offering marketers an ultimate platform for creating the moments that shape customers’ decisions and brand preferences.

For the purposes of this project, we were asked to target companies from the e-commerce and digital marketing fields, as well as enterprises, in Germany.

The number of employees within the e-commerce sector was irrelevant. In the digital marketing sector, we were asked to target companies with 50-200 employees, and for enterprises, those with more than 1000 employees.

We were to target CMO / Head of / Director of Marketing, E-Commerce, Digital, SEO or Content positions.

So the following steps need to be undertaken when looking for e-commerce sites:

1. With the help of the Snov.io Technology checker feature, you can find companies based on the technology they use. When it comes to e-commerce, Snov.io gives you a choice of 140+ e-commerce technologies. You can select the first 10 e-commerce technologies.

Snovio Technology Checker - using scrapers in prospecting

2. When you click search, the first result is limited to 100 websites, but you get some more advanced filters: industry, employee size, country, language. As we didn’t have any limitations (except Germany), I set the country and increased the limit to 5,000 websites per search.

technology checker - using scrapers in prospecting

3. Snov.io identified 2,899 websites that matched the criteria, and only four of them matched the technologies (this needs to be tested further). From here on, you can find the desired positions by clicking the ‘Find people to contact’ button. What I did next was to reduce the limit from 5,000 to 100 websites and restrict the search to Shopify only. In this way, I got 95 companies, which I then exported.

4. I got this:

Finished data by using scrapers

An Excel sheet with no company names, only URLs.

5. Then I used this list to search for positions (uploaded as CSV). You can also use the ‘Find people to contact’ option directly from the search results. Based on the requirement, I set it up to look for specific positions.

Scrapers scheme

6. The companies were relevant, but I needed to search for the leads manually. The alternative was to play a little with the role names until a better number popped out.
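
To get a feel for what “playing with the role names” means, here is a small Python sketch that checks job titles against the target positions from this case. It is an illustration only: the keyword lists, the sample titles and the `matches_target` helper are my own assumptions, and this is not how Snov.io matches positions internally.

```python
import re

# Keywords taken from the brief; extend them to widen the search.
SENIORITY = ("head of", "director")
AREAS = ("marketing", "e-commerce", "ecommerce", "digital", "seo", "content")


def contains_any(text, keywords):
    """True if any keyword appears in the text as a whole word or phrase."""
    return any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords)


def matches_target(title, seniority=SENIORITY, areas=AREAS):
    """Check whether a job title looks like one of the target positions."""
    text = re.sub(r"\s+", " ", title.lower()).strip()
    if re.search(r"\bcmo\b", text):
        return True  # CMO counts on its own
    return contains_any(text, seniority) and contains_any(text, areas)


# Hypothetical sample titles.
titles = [
    "CMO",
    "Head of SEO",
    "Director Marketing",
    "Marketing Manager",
    "Content Lead",
    "Head of Finance",
]

strict = [t for t in titles if matches_target(t)]
# Loosen the seniority keywords to see how many more leads show up.
broader = [
    t for t in titles
    if matches_target(t, seniority=SENIORITY + ("manager", "lead"))
]
print(len(strict), len(broader))
```

Comparing the strict and the broader count shows how much a looser definition of the role adds, before you spend credits or hours on the extra leads.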

Conclusion

Phew, it took me longer than I thought to gather up all this data in one place and to put it in words.

Well, to be honest, the most difficult part was trying not to make a book out of it and not to bore you with too many details, while still giving you a plain and simple guide so you can do it yourself.

Building B2B databases is not as simple as ABC, especially if you’ve been given the task of automating the database building process.

I’ll take a wild guess and assume that you’re probably not overly excited about it. Especially if it’s something you’ve got to build from scratch.

But with the right tools and guidelines in hand, it shouldn’t be as exhausting and as boring.

Time is such a valuable thing. It would be a shame to waste it on manually searching for data that can be scraped.

So I hope I helped even the slightest bit with this guideline on using scrapers to build B2B databases.

I am aware that no matter how much I try or how many case studies I present to you, there’s no right or ideal way to build a solid prospecting base that will last forever. Data changes in the blink of an eye, and you need to regularly update your database.

Nowadays people are even more prone to changing their jobs. So, all the data can be outdated quickly.

But finding the ideal prospect is a complicated and time-consuming process. According to statistics, over a third of people believe that prospecting and lead qualification is the biggest challenge in 2021. More than 75% of generated leads never convert into sales, and the most common reason for this drop-off rate is a lack of lead nurturing, the process of developing relationships with prospects by providing them with ongoing value and information.

Half of the respondents note that they aren’t using sales technology and automation to make the lead qualification process more efficient—thus spending a large portion of their workday prospecting and qualifying leads.

Conversely, business people that do use lead generation tools are 13% more likely to consider themselves successful at sales and 7% more likely to hit their annual sales targets.

An effective B2B sales prospecting process, aided by sales technology, can help you streamline your prospecting efforts so that you can focus on selling and landing new business.

But whatever you do, don’t jump right into automation if you don’t know how to build prospect databases manually. As with everything in life, you need to have a solid base that you can build on. How can you automate a process if you don’t know how it works and haven’t considered all the variables that can go wrong?