How to Get Data from Website in Power BI

Cody Schneider

Pulling data directly from a website can unlock powerful, automated reporting in Power BI. Instead of manually downloading CSV files or copying and pasting tables, you can create a refreshable connection to a web page and turn its data into dynamic dashboards. This article walks you through how to get data from a website with Power BI's Web connector, clean it up in the Power Query Editor, and set things up so that a refresh keeps your reports up to date.

Why Connect Power BI to a Website?

Connecting Power BI directly to a website eliminates the tedious cycle of manual data exports. Imagine you're tracking competitor pricing, monitoring economic data from a government site, or even pulling sports statistics for a fantasy league. Doing this manually means visiting the site repeatedly, copying the data, pasting it into a spreadsheet, and then importing it into Power BI. It's a time-consuming process that results in a static, instantly outdated report.

When you use Power BI’s "Web" connector, you create a refreshable link. Your report stays current by pulling the latest information each time it refreshes, whether you trigger the refresh manually or schedule it. This approach is perfect for accessing public information presented in structured formats, such as:

  • Financial tables from sites like Yahoo Finance or a stock exchange.

  • Demographic data from sources like Wikipedia or government statistics pages.

  • Product listings or pricing information from e-commerce sites (that display data in simple tables).

  • Public sports, weather, or real estate data.

By connecting to the source, you automate the entire data collection workflow, freeing you to focus on analyzing the insights rather than hunting for the data.

Preparing to Pull Web Data: Finding the Right Source

Before jumping into Power BI, it helps to spend a minute evaluating the website you want to scrape. Power BI's built-in web connector works best with data that is presented in standard HTML tables: neatly organized rows and columns you can easily identify on a web page.

What to Look For

The ideal source for this method is a static web page where the data is cleanly laid out inside a <table> HTML tag. A great example of this is almost any data-based list on Wikipedia, like the List of largest companies by revenue. The data is clearly structured in rows and columns, making it easy for Power BI to read.

Keep an eye out for these characteristics:

  • Static URLs: The URL of the page with the data on it should not change frequently.

  • Clear Tables: You should visually see a grid-like table with distinct headers and rows.

  • Simple Web Pages: Websites that are heavy on interactive JavaScript, require a login, or load data dynamically as you scroll can be trickier. While it is possible to scrape them, they often require more advanced methods than the basic "Web" connector.
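One quick way to apply this checklist is to look at the page's raw source and see whether it contains any <table> elements at all. The following is a minimal Python sketch of that check, using only the standard library. It is an illustration, not something Power BI runs, and the sample markup and the names TableCounter and count_tables are hypothetical.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts <table> elements and <tr> rows in static HTML."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
        elif tag == "tr":
            self.rows += 1


def count_tables(html):
    """Return (number of tables, number of rows) found in the markup."""
    counter = TableCounter()
    counter.feed(html)
    return counter.tables, counter.rows


# Hypothetical page source; in practice you would paste in the output of
# your browser's "View Source" for the page you want to use.
SAMPLE_HTML = """
<html><body>
  <table>
    <tr><th>Rank</th><th>Company</th></tr>
    <tr><td>1</td><td>Example Corp</td></tr>
  </table>
  <div id="filled-in-later-by-script"></div>
</body></html>
"""

print(count_tables(SAMPLE_HTML))
```

If the count comes back as zero even though you can see a grid in the browser, the data is probably being loaded by JavaScript after the page opens, which is the harder case described in the last bullet.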

What to Avoid (or approach differently)

If the data you need doesn't load until you interact with the page (e.g., clicking a "Load More" button), the basic connector might not see it. Likewise, data behind a login form is generally out of reach for the basic connector. In these cases, you would typically need a different approach, such as connecting to an API if the site offers one, which is a different process altogether.

For this tutorial, we will focus on the most common use case: pulling data from a publicly accessible web page with visible HTML tables.

Step-by-Step Guide: Connecting Power BI to a Website

Let's walk through an example. We'll pull a table showing the World's 50 Best Restaurants from Wikipedia. The URL for this page is: https://en.wikipedia.org/wiki/The_World%27s_50_Best_Restaurants

Step 1: Get Data from Web

First, open a blank Power BI Desktop file. In the Home tab of the ribbon, click on the Get Data dropdown and select Web.

Step 2: Enter the URL

A dialog box will appear. Simply paste the URL of the website into the text box. Leave the selection on Basic and click OK.

Power BI will now connect to the page and scan its contents for any data structured in HTML tables.
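To build some intuition for what "scanning for tables" means, here is a rough Python sketch that collects every table in a piece of HTML as a list of rows. It is only an analogy for the connector's behavior, not its actual implementation. The sample markup and the names TableExtractor and extract_tables are made up for illustration, and nested tables are deliberately not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects each <table> as a list of rows; each row is a list of cell strings.

    Simplification: nested tables and rows without a closing </tr> are not handled.
    """

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return every table found in the markup, in document order."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables


# Hypothetical markup with one data table and one navigation table.
SAMPLE_HTML = """
<table>
  <tr><th>Year</th><th>Restaurant</th></tr>
  <tr><td>2002</td><td>Example Bistro</td></tr>
</table>
<table><tr><td>Navigation</td></tr></table>
"""

# Print the first row of each table as a quick preview.
for index, table in enumerate(extract_tables(SAMPLE_HTML)):
    print(index, table[0])
```

The result is a list of candidate tables, only some of which hold the data you actually want.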

Step 3: Choose Your Data from the Navigator

After a few moments, the Navigator window will appear. On the left side, you'll see a list of all the tables Power BI found on the page. Wikipedia pages often contain multiple tables, some of which are navigation boxes or infoboxes rather than the data you want.

To find the right data, you can click on each table name to see a preview on the right. Better yet, click on the Web View tab to see a rendering of the actual webpage. You can then click directly on the table in the Web View to highlight the corresponding data table in the list.

Once you've identified the correct table (in our example, it's named "Winners 2002–2023"), tick the checkbox next to its name.

Step 4: Load or Transform Data

At the bottom right of the Navigator, you'll see two options: Load and Transform Data.

  • Load: This option will load the data directly into your Power BI data model as-is. This is generally not recommended, as web data is almost never perfectly clean.

  • Transform Data: This is the best choice. It opens the Power Query Editor, where you can clean, shape, and prepare the data before loading it into your model.

Always choose Transform Data.

Cleaning and Shaping Your Web Data with Power Query

The Power Query Editor is where the real work of refining your data happens. Web data often comes with imperfections like merged cells, null values, or incorrect data types, and Power Query makes fixing them easy.

Here are some of the most common cleaning steps you'll perform on data scraped from a website:

1. Promote Headers

Often, Power BI will import the table with generic column names (Column1, Column2, etc.), and the actual headers will be in the first row of data. To fix this, go to the Home tab in Power Query and click Use First Row as Headers.
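Conceptually, promoting headers takes the first row out of the data and uses it as the column names. A small Python sketch of the same idea, with hypothetical sample rows and a made-up helper name promote_headers, looks like this:

```python
def promote_headers(rows):
    """Treat the first row as column names and return the remaining rows as dicts."""
    if not rows:
        return []
    headers, *data = rows
    return [dict(zip(headers, row)) for row in data]


# Hypothetical rows as they might arrive: the header text sits in the first row.
raw_rows = [
    ["Year", "Restaurant"],
    ["2002", "Example Bistro"],
    ["2003", "Sample Kitchen"],
]

records = promote_headers(raw_rows)
print(records[0])
```

After this step each record is keyed by the real column name rather than by a generic position such as Column1.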

2. Check and Change Data Types

Power Query usually does a good job of guessing the data type for each column, but it's a good idea to double-check. A year should be a Whole Number, a restaurant name should be Text, and so on. Click the data type icon on the left side of the column header (e.g., ABC for text, 123 for number) to change the data type if it's incorrect.
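The reason to check types is that everything scraped from a page starts out as text, and some values will not convert cleanly. As a rough illustration in Python (the helper to_whole_number and the sample records are hypothetical), a conversion with a fallback might look like this. Note one difference: Power Query flags a failed conversion as an error in the cell, whereas this sketch simply substitutes None.

```python
def to_whole_number(value):
    """Convert text to an int, returning None when the text is not a whole number."""
    try:
        return int(value.strip())
    except ValueError:
        return None


# Hypothetical records where every value is still text.
records = [
    {"Year": "2002", "Restaurant": "Example Bistro"},
    {"Year": "n/a", "Restaurant": "Sample Kitchen"},
]

# Rebuild each record with the Year column converted to a number.
typed = [{**record, "Year": to_whole_number(record["Year"])} for record in records]
print(typed)
```

Rows that end up with a missing year are worth inspecting before you build visuals on that column.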

3. Remove Unwanted Columns and Rows

Did the import bring in columns you don't need, like a "Notes" or "Ref." column cluttered with reference links? Simply right-click the column header and select Remove.

You can also filter out unwanted rows. For instance, if the table contains summary rows at the bottom that you don't need, you can use the filter dropdown on a column to exclude them.
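Both clean-up actions are simple filters: one over columns, one over rows. Here is a short Python sketch of the logic, using hypothetical sample records and made-up helper names (remove_columns, exclude_rows):

```python
def remove_columns(records, unwanted):
    """Return copies of the records without the named columns."""
    return [
        {key: value for key, value in record.items() if key not in unwanted}
        for record in records
    ]


def exclude_rows(records, column, excluded):
    """Drop records whose value in `column` is one of the excluded values."""
    return [record for record in records if record.get(column) not in excluded]


# Hypothetical records with a reference column and a summary row at the bottom.
records = [
    {"Year": "2002", "Restaurant": "Example Bistro", "Ref.": "[1]"},
    {"Year": "2003", "Restaurant": "Sample Kitchen", "Ref.": "[2]"},
    {"Year": "Total", "Restaurant": "2 winners", "Ref.": ""},
]

cleaned = exclude_rows(remove_columns(records, {"Ref."}), "Year", {"Total"})
print(cleaned)
```

In Power Query the same two actions appear as separate entries under Applied Steps, so either one can be undone on its own.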

4. Tidy Up Text Data

In our example, you may notice that some restaurant locations carry a footnote marker left over from the web page, appearing like "Copenhagen, Denmark[5]". To clean this:

  • Select the column.

  • Go to the Transform tab and click Split Column > By Delimiter.

  • Choose Custom as the delimiter and enter the opening bracket [. This will split the text into two columns at that point.

  • You can then delete the new column that contains the unwanted remnant (e.g., "5]").
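The split-and-delete steps above amount to keeping everything before the first opening bracket. A minimal Python sketch of that logic, where the helper name strip_footnote is hypothetical, could be:

```python
def strip_footnote(text, delimiter="["):
    """Keep only the text before the first delimiter, trimmed of whitespace."""
    before, _, _ = text.partition(delimiter)
    return before.strip()


print(strip_footnote("Copenhagen, Denmark[5]"))
```

Values that contain no bracket pass through unchanged, so the same rule can be applied safely to the whole column.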

5. Final Review and Load

Once you've applied all your cleaning steps (each of which is recorded under "Applied Steps" on the right), your data should look pristine. When you're happy, click Close & Apply in the top-left corner of the Power Query Editor. Your cleaned data will now be loaded into your Power BI report, ready for building visualizations.

Advanced Trick: Add Table Using Examples

What if the data on a webpage isn't in a clean HTML table? Sometimes you'll find data neatly displayed, but Power BI’s automatic detection fails to identify it as a formal table. For this scenario, Power BI offers a brilliant feature: Add table using examples.

When you're in the Navigator screen, if you don't see your desired data in the list of discovered tables, look for a button at the bottom left called Add table using examples.

Clicking this opens an interactive interface where you can essentially teach Power BI how to extract the data.

  1. An interactive preview of the website is shown in the top pane.

  2. An empty table is shown below that. Simply start typing values from the website into the first row of the empty table.

  3. For instance, start in Column 1 and type the name of the first restaurant. Press Enter. In Column 2, type its location.

  4. As you provide a few examples, Power BI's pattern detection algorithm will kick in and automatically fill out the rest of the column with corresponding data it finds on the page.

  5. Review the suggestions. If they are correct, give your columns proper names, and click OK.

Power BI will now generate a query based on the examples you provided and create a table you can bring into Power Query for further cleaning. This is an incredibly powerful feature for scraping semi-structured data without writing any code.

Final Thoughts

Connecting Power BI directly to online data sources automates your reporting and ensures you're always working with the most current information. By mastering the Web connector and the basic cleaning steps in Power Query, you can transform countless web pages into valuable, refreshable datasets for your dashboards, saving hours of manual work in the process.

While this method is fantastic for public data, connecting to all your internal marketing and sales platforms, such as Google Analytics, Salesforce, HubSpot, or Shopify, can be much more complex. This is where tools like Graphed simplify things. We built our tool to instantly connect your core data sources and allow you to build reports using simple, natural language. Instead of wrangling connectors and APIs, you can just ask what you want to see, and a real-time dashboard is created in seconds.