Refresh Power BI datasets with Data Factory

This post gives you everything you need to set up a pipeline in Data Factory so that your Power BI datasets can be refreshed as soon as the data is available. It builds on an earlier post where we set up some Azure Functions to call the Power BI APIs.

The obvious reason for setting up this pipeline is to ensure that Power BI datasets are refreshed only when the underlying data has actually been updated. Another reason is that scheduled refresh has one major drawback: you don't get reruns. All it takes is a minor connection issue and you have to go into the Power BI portal and refresh the dataset manually. With this pipeline those days are gone.

This pipeline has two parameters: groupName and datasetName. These must be the exact (case-sensitive) names of your Power BI workspace and dataset. Next, you must create a linked service to your Azure Function (as in the post linked above) so that you can use Azure Function activities.
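If you prefer to look at the pipeline JSON rather than the designer, a minimal sketch of the parameter definition could look like this (the property layout follows the standard Data Factory pipeline schema; the parameter names are the ones used in this post):

    "parameters": {
        "groupName": { "type": "String" },
        "datasetName": { "type": "String" }
    }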

Finding dataset id and group id

The first step of this pipeline is to run the function that finds the groupId and datasetId based on the pipeline parameters. The reason for having this step is that you can check the monitor and easily see which workspace and dataset you attempted to refresh. The results of the function calls are stored in two pipeline variables for easy reuse.

First part of the pipeline, finding groupId and datasetId
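As a sketch, the two Set variable activities could use expressions like the ones below. This assumes two Azure Function activities named GetGroupId and GetDatasetId that each return a JSON object with the corresponding property; those names are my assumptions, not fixed by the post, but the output pattern matches the expressions shown later:

    @activity('GetGroupId').output.groupId
    @activity('GetDatasetId').output.datasetId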

Run the first refresh

Next you will run the refreshDataset function to trigger a refresh, and then use an Until loop to check the status. The reason for this is that an Azure Function activity will time out after 230 seconds regardless of the timeout period you set. I therefore use an Until loop with a 1-minute Wait activity between checks. The exit criteria are that the status is Completed or Failed, and that the startTime of the refresh is later than the pipeline trigger time (to ensure that we don't pick up an earlier refresh).

Refresh, and then wait for it to complete
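The Until expression for this first loop could look something like the sketch below. It assumes the status-check activity is named GetRefreshHistory_1 (the same name used in the expression later in this post), that its output exposes status and startTime for the latest refresh, and that startTime and the pipeline trigger time are both ISO 8601 strings so the comparison works:

    @and(
        or(
            equals(activity('GetRefreshHistory_1').output.status, 'Completed'),
            equals(activity('GetRefreshHistory_1').output.status, 'Failed')
        ),
        greater(activity('GetRefreshHistory_1').output.startTime, pipeline().TriggerTime)
    )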

Rerun if failure

After this I add an If activity to check whether the status was Failed. In that case I run a second refresh, followed by a new Until. The new Until has a similar expression, with some modifications:

    @or(
        and(
            or(
                equals(activity('GetRefreshHistory_2').output.status, 'Completed'),
                equals(activity('GetRefreshHistory_2').output.status, 'Failed')
            ),
            greater(activity('GetRefreshHistory_2').output.startTime, activity('GetRefreshHistory_1').output.startTime)
        ),
        equals(activity('GetRefreshHistory_1').output.status, 'Completed')
    )

In other words, the second Until exits when either the second refresh has finished (and started after the first one), or the first refresh actually succeeded.
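The condition on the If activity itself can be a simple check on the first refresh's status, for example (a sketch, reusing the same GetRefreshHistory_1 activity name as above):

    @equals(activity('GetRefreshHistory_1').output.status, 'Failed')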

Options if the second refresh fails

If the refresh fails a second time you have some options. One thing you can do is send an email notification with a Logic App (and if you really want to shine, use the Power BI API to collect the workspace members/admins and notify them). Alternatively, you can make the pipeline fail and rely on the monitoring you already have in place for that. One way to make that happen is to add an Azure Function activity that calls a function that doesn't exist, placed inside an If activity.
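As a sketch, that If activity could use a condition like the one below (again assuming the GetRefreshHistory_2 activity name from the expression above), with the deliberately failing Azure Function activity inside the True branch:

    @equals(activity('GetRefreshHistory_2').output.status, 'Failed')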
