Azure Functions is a great way to do the things Data Factory can’t. In this example I want to use it to get an OAuth token from Strava, and I want all my secrets to be stored in Azure Key Vault. Data Factory can’t look up values in Key Vault and build a header for me (as far as I know), but Azure Functions can do this (and that will be my next blog post).
So, this should be simple. Data Factory has an Azure Function activity, documented here. By default all Azure Functions are secured with a master key, and I have put this key into Key Vault to configure my Function linked service like this (here is a description of linking Data Factory to Key Vault):
But is this secure enough? If someone gets hold of your master key, they can call the function without any further authentication. One way to avoid this is to limit access to the function to certain IPs, but that would require setting up an integration runtime on a machine or VM with a fixed IP and calling the function through that. The option I went for was to secure the app by requiring Azure AD authentication.
But then I had the next problem. The Azure Function linked service doesn’t seem to support calling functions with authentication! So I had to explore other options. What I ended up with was the REST linked service, because it can actually use Azure AD authentication, and do so with a service principal whose secret is stored in Key Vault.
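To make the REST linked service setup a bit more concrete, here is a minimal sketch in Python that builds the kind of JSON definition I believe Data Factory uses for a REST linked service with service principal authentication. The linked service name, the helper function and all the argument values are hypothetical placeholders, and the property names should be checked against the current REST connector documentation before you rely on them.

```python
import json


def rest_linked_service(function_url, tenant_id, client_id,
                        aad_resource_id, key_vault_linked_service, secret_name):
    """Build a REST linked service definition (as a dict) that authenticates
    with a service principal whose secret is read from Key Vault."""
    return {
        "name": "FunctionViaRest",  # hypothetical name
        "properties": {
            "type": "RestService",
            "typeProperties": {
                "url": function_url,
                "enableServerCertificateValidation": True,
                "authenticationType": "AadServicePrincipal",
                "servicePrincipalId": client_id,
                # The secret itself never appears here, only a Key Vault reference.
                "servicePrincipalKey": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_linked_service,
                        "type": "LinkedServiceReference",
                    },
                    "secretName": secret_name,
                },
                "tenant": tenant_id,
                # The Azure AD resource (the function app registration) to get a token for.
                "aadResourceId": aad_resource_id,
            },
        },
    }


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers.
    definition = rest_linked_service(
        function_url="https://my-function-app.azurewebsites.net/api/",
        tenant_id="<tenant-id>",
        client_id="<service-principal-application-id>",
        aad_resource_id="<function-app-registration-id>",
        key_vault_linked_service="MyKeyVault",
        secret_name="function-sp-secret",
    )
    print(json.dumps(definition, indent=2))
```

The point of the layout is that only the Key Vault reference and the secret name end up in the Data Factory code, never the service principal secret itself.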
Next: you cannot use REST in a lookup activity to just get the token value, but you can use REST in a copy activity. So, I created a copy activity to store the value in a file in my data lake. Then I look up this file, with the secure output checkbox ticked on the lookup (found in the General tab). This makes the result look like this in Monitor:
Similarly, I checked secure input on my next copy activity, to keep the token from showing in Monitor or logs. Then I used this as the source definition:
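A source definition along these lines could look roughly like the sketch below, written as a Python dict that mirrors the pipeline JSON. It is an illustration under assumptions, not the exact definition from my pipeline: the lookup activity name `LookupToken` and the column name `access_token` are hypothetical and depend on how the token file is written and read, and the expression wrapper should be verified against what the Data Factory UI generates.

```python
import json


def bearer_header_expression(lookup_activity, column):
    """Return a Data Factory string-interpolation expression that builds a
    Bearer header from the first row of a lookup activity's output."""
    return f"Bearer @{{activity('{lookup_activity}').output.firstRow.{column}}}"


def rest_source_with_token(lookup_activity="LookupToken", column="access_token"):
    """Build a REST copy source that sends the looked-up token as a header.
    Activity and column names are hypothetical placeholders."""
    return {
        "type": "RestSource",
        "requestMethod": "GET",
        "additionalHeaders": {
            "Authorization": {
                "value": bearer_header_expression(lookup_activity, column),
                "type": "Expression",
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(rest_source_with_token(), indent=2))
```

Because the header is an expression evaluated at run time, the token value is never part of the pipeline code, and with secure input checked it should not show up in the activity input in Monitor either.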
This made it possible for me to use the token that I got from my secured Azure Function without any secure values showing in Monitor or code. Everything is secured with Azure Key Vault. But what do you do with the file stored in the data lake with the token in clear text? It is deleted with a delete activity, so it is only there for a few seconds.
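The whole chain of activities can be summarised with a small sketch. The Python below builds only the skeleton of such a pipeline: activity names are hypothetical, the `typeProperties` (datasets, source, sink) are left out on purpose, and I have assumed the delete runs right after the lookup has succeeded, which is one possible ordering and not necessarily the only sensible one.

```python
def after(activity_name):
    """Dependency entry: run only when the named activity has succeeded."""
    return [{"activity": activity_name, "dependencyConditions": ["Succeeded"]}]


def token_pipeline_activities():
    """Skeleton of the four activities; typeProperties are omitted."""
    return [
        # 1. Call the secured function through REST and land the token in the lake.
        {"name": "CopyTokenToLake", "type": "Copy", "dependsOn": []},
        # 2. Read the token back; secureOutput hides it in Monitor.
        {
            "name": "LookupToken",
            "type": "Lookup",
            "dependsOn": after("CopyTokenToLake"),
            "policy": {"secureOutput": True},
        },
        # 3. Remove the clear-text token file as soon as it has been read.
        {"name": "DeleteTokenFile", "type": "Delete", "dependsOn": after("LookupToken")},
        # 4. Call the Strava API with the token; secureInput hides the header.
        {
            "name": "CopyStravaData",
            "type": "Copy",
            "dependsOn": after("LookupToken"),
            "policy": {"secureInput": True},
        },
    ]


if __name__ == "__main__":
    for activity in token_pipeline_activities():
        depends = [d["activity"] for d in activity["dependsOn"]]
        print(activity["name"], "<-", depends, activity.get("policy", {}))
```

One thing the sketch makes visible: if the lookup fails, a delete that only runs on success leaves the token file behind, so it may be worth adding a cleanup path for that case.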
My reason for doing it this way, and returning the token that is all you need to read Strava data, is that I can now use Data Factory for all the different types of API calls without having to implement everything in my Azure Function. It might be possible to do this in an easier way, but at least I have an Azure Function where calls are authenticated, and the token is never stored for more than a few seconds.