API Aggregation with Django REST Framework

When building a modern web application, chances are you're going to be using an API. Backend as a service (BaaS) providers like Firebase are great, but sometimes you just need to roll your own. Django REST Framework is a great package to help you do that. You can easily start with a simple API and build it up later. Best of all, it's written in beautiful, tasty Python.

Now, I'm not going to be teaching you how to install and set the basics up because the docs have a great and simple quickstart for that. I will be talking about the many scenarios where you may want to aggregate data from third-party APIs.

Full disclosure: This is actually something I've been doing for a personal project but thought I'd share how I did it.

The scenario

You have a great music taste, right? Not as great as mine, but it's pretty good. So you want to share the awesome tunes you've found in the depths of the internet. You decide to create a single page app (SPA) to do this, but first you need to build your API backend. Rather than rolling your own audio storage and streaming, you want to use a third party. Soundcloud has an easy-to-use and well-documented API with a Python wrapper, so you go with that. Keep in mind that you may want to add other audio providers in the future (especially considering Soundcloud's ever-changing usage policies).

Note: I use Soundcloud here because it's simple. However, with the way this is structured, there is nothing to stop you adding other providers like Mixcloud. Alternatively, you can mix in a local provider as well.

The app

You're super imaginative so you call your app "MyChoons". You need to be able to upload a track to one of the audio providers and see a list of all the tracks you've added.

We'll only cover uploading and retrieving tracks here but it shouldn't be too difficult to add updating and destroying capabilities to your API.

Setup

Top tip: Use a different virtual environment for every Python project. This keeps all of your dependencies separate for each project. A simple guide for this is available here.

I assume you've got Django all set up. If you haven't then follow the quick install guide.

First off, start a new Django project. Skip this if you want to add this to an existing project.

django-admin startproject MyChoons  

Grab those dependencies:

pip install djangorestframework soundcloud  

Then sync your database with python manage.py migrate.

Don't forget to add rest_framework to your installed apps! Your settings file should look something like this.

# MyChoons/settings.py

...
INSTALLED_APPS = [  
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',    
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework'
]
...

Create the Tracks App:

python manage.py startapp tracks  

Don't forget to add tracks to the settings.py file!
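For reference, the list might end up looking like the fragment below. This is only a sketch of the relevant setting, assuming you kept the default app name `tracks` that startapp generated; the rest of your settings file stays as it is.

```python
# MyChoons/settings.py (fragment, not a complete settings file)
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'tracks',  # the app created by `python manage.py startapp tracks`
]
```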

Das Model

Now create your Track model:

# tracks/models.py 
from django.db import models


class Track(models.Model):  
    CHOICES = [
        ('SC', 'Soundcloud')
    ]

    provider = models.CharField(max_length=2, choices=CHOICES)
    provider_id = models.CharField(max_length=100, blank=True)
    uploaded = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)

I said earlier that this setup will allow you to add any provider easily in the future. In other words, this model is provider-agnostic. The provider field tells us which provider is used. The provider_id allows us to get the track from the provider's API.

Notice how the provider_id is a CharField and not an IntegerField. This is because some providers (e.g. Mixcloud) use name-based identifiers.
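To make the point concrete, here's a small plain-Python sketch. The identifier values are made up, the 'MC' short name is hypothetical (it isn't in our CHOICES yet), and the name-based key format is my assumption rather than something taken from any provider's docs.

```python
# Hypothetical identifiers: one numeric, one name-based.
example_ids = {
    'SC': 291837465,               # numeric ID (made-up value)
    'MC': '/some-user/some-mix/',  # name-based key (assumed format)
}


def as_provider_id(raw_id):
    """Normalise any provider identifier to the string stored in provider_id."""
    return str(raw_id)


stored = {code: as_provider_id(raw) for code, raw in example_ids.items()}
# Both values fit a CharField(max_length=100);
# only the first would fit an IntegerField.
```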

Finally, migrate the changes:

python manage.py makemigrations tracks
python manage.py migrate  

Register with Soundcloud

If you haven't already you'll need to register an app with Soundcloud here. You'll need your client ID and secret later.

Time to create that API

Okay, that's the boring foundations down. Time to conjure up the fun stuff.

Serializer

First, we need to set up the serializer. This looks incredibly similar to Django's built-in forms API, except for a couple of key differences:

  • It uses REST Framework's Serializer base class instead of Django's Form.
  • We must declare the create() and update() methods which provide the logic for (you guessed it) creating and updating! Psst! We've skipped the update() method here.

# tracks/serializers.py
from rest_framework import serializers  
from tracks.models import Track  
from tracks.providers import GenericProvider


class TrackSerializer(serializers.Serializer):  
    id = serializers.IntegerField(read_only=True)
    uploaded = serializers.BooleanField(read_only=True)
    created_at = serializers.DateTimeField(read_only=True)

    # ===============
    # Writable fields
    # ===============
    # use the choices in the Track model
    provider = serializers.ChoiceField(choices=Track.CHOICES)
    file = serializers.FileField(write_only=True, 
                                 allow_empty_file=False)
    title = serializers.CharField(write_only=True)
    description = serializers.CharField(write_only=True)

    # ===============
    # Provider fields
    # ===============
    provider_id = serializers.CharField(read_only=True)
    provider_url = serializers.URLField(read_only=True)
    provider_artwork_url = serializers.URLField(read_only=True)
    provider_title = serializers.CharField(read_only=True)
    provider_description = serializers.CharField(read_only=True)
    provider_permalink = serializers.URLField(read_only=True)
    provider_created_at = serializers.CharField(read_only=True)

    def create(self, validated_data):
        # We'll fill this in later with the logic for 
        # uploading our tracks to SoundCloud
        pass

You'll notice that there are a lot of properties here that aren't in the Track model. Weird, right? In most cases, your serializer's properties will be exactly the same as your model's. However, since we plan on using an external provider for our tracks, we'll get many properties from an external API.

You'll see later how this unfolds, but our serializer won't actually be receiving a model instance. Instead, we'll send it a standard Python object with all of the relevant properties, both from our model and from the track provider.
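Roughly speaking, a serializer builds its output by reading one attribute per declared field, which is why any object with the right attribute names will do. The snippet below is a simplified stand-in for that idea in plain Python, not REST Framework's actual implementation. `represent` and `READ_FIELDS` are names I've made up for the illustration, and the values are invented.

```python
from types import SimpleNamespace

# An illustrative subset of the serializer's read-only fields.
READ_FIELDS = ('id', 'provider', 'provider_title', 'uploaded')


def represent(obj, field_names=READ_FIELDS):
    """Build a dict by reading each named attribute off obj."""
    return {name: getattr(obj, name, None) for name in field_names}


# Any object with matching attributes works; no model instance required.
combined = SimpleNamespace(id=1, provider='SC',
                           provider_title='My tune', uploaded=True)
representation = represent(combined)
```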

Upload to the inter-webs

We said earlier that we wanted to make the app agnostic to which provider the track is uploaded to. For now, we're only going to use Soundcloud but at some point we may want to add more providers. So, we create a GenericProvider class that accepts a Track model instance and uses the correct provider for that track.

# tracks/providers.py
class GenericProvider(object):  
    def __init__(self, track):
        self.track = track
        self.provider = eval(track.provider + "Provider")

    def upload(self, file, title, description):
        return self.provider(self.track).upload(file, title, 
                                                description)

    def retrieve(self):
        return self.provider(self.track).retrieve()

Now, if we give it a Track whose provider is Soundcloud, it will try to use the SCProvider class. Let's create that.

Note: the short name (SC) is used in the class name rather than the human-readable name (Soundcloud).
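One thing to be aware of: eval() will execute whatever string ends up in the provider field. If that makes you nervous, a dictionary lookup is one possible alternative. The sketch below uses a stand-in SCProvider and a fake track so it runs on its own; the `PROVIDERS` name and the error message are my own choices, not part of the app above.

```python
from types import SimpleNamespace


class SCProvider(object):
    """Stand-in for the real Soundcloud provider class."""
    def __init__(self, track):
        self.track = track

    def retrieve(self):
        return 'retrieved %s from Soundcloud' % self.track.provider_id


# Map each short name from Track.CHOICES to its provider class.
PROVIDERS = {
    'SC': SCProvider,
}


class GenericProvider(object):
    def __init__(self, track):
        self.track = track
        try:
            self.provider = PROVIDERS[track.provider]
        except KeyError:
            raise ValueError('Unknown provider: %r' % track.provider)

    def retrieve(self):
        return self.provider(self.track).retrieve()


# A fake track object standing in for a Track model instance.
demo_track = SimpleNamespace(provider='SC', provider_id='123')
result = GenericProvider(demo_track).retrieve()
```

Adding a provider then means writing its class and adding one entry to the dictionary.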

# tracks/providers.py
import soundcloud

class SCProvider(object):  
    def __init__(self, track):
        self.track = track

        # This is unsafe. 
        # Never include the username and password 
        # like this in production!
        self.client = soundcloud.Client(client_id='YOUR CLIENT ID',
                                        client_secret='YOUR CLIENT SECRET',
                                        username='YOUR USERNAME',
                                        password='YOUR PASSWORD')

    def upload(self, file, 
               title, description):
        track = self.client.post('/tracks', track={
            'title': title,
            'description': description,
            'asset_data': file
        })

        self.track.uploaded = True
        self.track.provider_id = track.id
        self.track.save()

        return self.retrieve()

    def retrieve(self):
        retrieved = self.client.get('/tracks/' 
                                    + str(self.track.provider_id))
        return self._build_track(retrieved)

    def _build_track(self, provider_data):
        return BuiltTrack(
            id = self.track.id,
            provider = self.track.provider,
            provider_id = provider_data.id,
            provider_artwork_url = provider_data.artwork_url,
            provider_title = provider_data.title,
            provider_description = provider_data.description,
            provider_permalink = provider_data.permalink_url,
            provider_created_at = provider_data.created_at,
            uploaded = self.track.uploaded,
            created_at = self.track.created_at
        )

What's this BuiltTrack object, I hear you say? Well, that would be the Python object that is passed to our serializer. You see, it has all the relevant properties and our serializer will be able to use it to... well... serialize. We should probably write the class, though.

# tracks/providers.py

class BuiltTrack(object):  
    def __init__(self, **kwargs):
        for field in ('id', 'provider', 
                      'provider_id', 'provider_artwork_url', 
                      'provider_title', 'provider_description',
                      'provider_permalink', 'provider_created_at', 
                      'uploaded', 'created_at'):
            setattr(self, field, kwargs.get(field, None))
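Here's a quick way to see how this class behaves: fields you don't pass come back as None, and keyword arguments outside the list are silently ignored. The class is repeated so the snippet runs on its own, and the values (including the `bogus` argument) are made up.

```python
class BuiltTrack(object):
    def __init__(self, **kwargs):
        for field in ('id', 'provider',
                      'provider_id', 'provider_artwork_url',
                      'provider_title', 'provider_description',
                      'provider_permalink', 'provider_created_at',
                      'uploaded', 'created_at'):
            setattr(self, field, kwargs.get(field, None))


# Only some fields supplied, plus one the class doesn't know about.
partial = BuiltTrack(id=1, provider='SC',
                     provider_title='My tune', bogus='ignored')

title = partial.provider_title          # the value we passed in
artwork = partial.provider_artwork_url  # not passed, so it defaults to None
has_bogus = hasattr(partial, 'bogus')   # unknown keyword was dropped
```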

Important: Currently, your username and password are hard-coded. That's acceptable only because this API is intended to be used by you alone, with your own account. If you were creating a production app, you would need to follow Soundcloud's OAuth flow to get the appropriate token to perform an upload. I recommend using django-allauth to implement authorization with third parties.
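Even for a personal project, you may prefer to keep the credentials out of the source file. One common approach is to read them from environment variables. The sketch below assumes variable names I've invented (SOUNDCLOUD_CLIENT_ID and friends), and `soundcloud_credentials` is a hypothetical helper, not part of the soundcloud package.

```python
import os

# Hypothetical variable names; pick whatever suits your setup.
REQUIRED_VARS = {
    'client_id': 'SOUNDCLOUD_CLIENT_ID',
    'client_secret': 'SOUNDCLOUD_CLIENT_SECRET',
    'username': 'SOUNDCLOUD_USERNAME',
    'password': 'SOUNDCLOUD_PASSWORD',
}


def soundcloud_credentials(environ=None):
    """Return the client keyword arguments, read from the environment."""
    environ = os.environ if environ is None else environ
    missing = sorted(var for var in REQUIRED_VARS.values()
                     if var not in environ)
    if missing:
        raise RuntimeError('Missing environment variables: '
                           + ', '.join(missing))
    return {kwarg: environ[var] for kwarg, var in REQUIRED_VARS.items()}


# In SCProvider.__init__ you could then write:
#     self.client = soundcloud.Client(**soundcloud_credentials())
```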

Now that we've added the ability to upload tracks we can update our serializer's create() method like so.

# tracks/serializers.py
...

class TrackSerializer(serializers.Serializer):  
    ...

    def create(self, validated_data):
        track = Track.objects.create(
            provider = validated_data.get('provider')
        )
        return GenericProvider(track).upload(
            file=validated_data.get('file'),
            title=validated_data.get('title'),
            description=validated_data.get('description')
        )

Viewsets and routing magic

We're going to be using REST Framework's magical viewsets and routers API to remove a lot of boilerplate code. I'll explain what each does shortly but in the meantime just follow along.

In tracks/views.py set up the viewset:

from rest_framework import status  
from rest_framework import viewsets  
from rest_framework.generics import get_object_or_404  
from rest_framework.response import Response  
from tracks.models import Track  
from tracks.providers import GenericProvider  
from tracks.serializers import TrackSerializer


class TrackViewSet(viewsets.ViewSet):  
    queryset = Track.objects.all()
    serializer_class = TrackSerializer

    def list(self, request):
        # Tracks that never finished uploading have no provider_id to look up
        queryset = Track.objects.filter(uploaded=True)
        data = []
        for track in queryset:
            retrieved = GenericProvider(track).retrieve()
            data.append(retrieved)
        serializer = TrackSerializer(data, many=True)
        return Response(serializer.data)

    def create(self, request):
        serializer = TrackSerializer(data=request.data)
        if serializer.is_valid():
            serializer.save()
            return Response(serializer.data, 
                            status=status.HTTP_201_CREATED)
        return Response(serializer.errors, 
                        status=status.HTTP_400_BAD_REQUEST)

    def retrieve(self, request, pk=None):
        track = get_object_or_404(Track, id=pk)
        retrieved = GenericProvider(track).retrieve()
        serializer = TrackSerializer(retrieved)
        return Response(serializer.data)

And then in MyChoons/urls.py:

from django.urls import include, path
from rest_framework.routers import DefaultRouter

from tracks.views import TrackViewSet

router = DefaultRouter()
router.register(r'tracks', TrackViewSet)

# On Django older than 2.0, use url(r'^', include(router.urls))
# from django.conf.urls instead.
urlpatterns = [
    path('', include(router.urls)),
]

How has this routing magic worked?

TL;DR: Viewsets and the router remove a lot of the repetitive work of creating the standard CRUD (create, retrieve, update and destroy) routes for you.

Viewsets combine related views into a single class with minimal logic. Here, we couldn't use the fancy ModelViewSet since we weren't serializing a model. But if you were serializing a model directly, you could save yourself tonnes of time by only having to declare the serializer class and the queryset.

A great thing that the viewset allows us to do is to use routers. The router reads a viewset and creates the standard CRUD routes for us automatically. If we were to write update and destroy methods we would get routes like:

  • /tracks/ (for list & create)
  • /tracks/:pk/ (for retrieve, update and destroy)
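Spelled out as a lookup table, the router's standard mapping from HTTP method and URL to viewset method looks roughly like this. It's just an illustration in plain Python (the router builds real URL patterns, not a dictionary), and `ROUTES` and `IMPLEMENTED` are names I've made up. Note that DefaultRouter adds a trailing slash to its URLs by default.

```python
# (HTTP method, URL pattern) -> viewset method the router dispatches to
ROUTES = {
    ('GET', '/tracks/'): 'list',
    ('POST', '/tracks/'): 'create',
    ('GET', '/tracks/{pk}/'): 'retrieve',
    ('PUT', '/tracks/{pk}/'): 'update',
    ('PATCH', '/tracks/{pk}/'): 'partial_update',
    ('DELETE', '/tracks/{pk}/'): 'destroy',
}

# Our TrackViewSet only defines these three methods,
# so only the matching method/URL pairs are handled.
IMPLEMENTED = {'list', 'create', 'retrieve'}
available = sorted(route for route, action in ROUTES.items()
                   if action in IMPLEMENTED)
```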

If you don't quite understand this yet then take a look at this great blog post by Xavier Ordoquy.

By Jove, I think we've done it!

If you now run python manage.py runserver and go to http://localhost:8000/tracks you should see the lovely browsable API. After you've uploaded a few tracks it should look something like this:

[Screenshot: Track List]

We've demonstrated how easy it is to aggregate data from your own database with that from other APIs within Django REST Framework. Thanks to the GenericProvider class it's so easy to add more data providers in the future. Although we didn't implement any update or destroy capabilities, it should be fairly easy to figure out how to do so.

If you have any questions then please let me know in the comments or tweet me @jcbwndsr. This was my first post so let me know what you think! Constructive criticism is always appreciated, just try not to make me cry.

Jacob Windsor
