Issue #5 resolved

S3BotoStorage: set Content-Type header, ACL fixed, use HTTP and disable query auth by default

Wim Leers created an issue

I created this patch for my bachelor thesis [1], as part of the daemon I wrote. I'm using django-storages to send files to FTP, Amazon S3 and Amazon CloudFront.

This patch is for S3BotoStorage and it does four things:

  1. The Content-Type header is now set automatically by guessing from the file extension via mimetypes.guess_type. Right now no Content-Type is set, and therefore the default binary MIME type is used: application/octet-stream. This causes browsers to download files instead of displaying them.
  2. The ACL now actually gets applied properly to the bucket and to each file that is saved to the bucket.
  3. Currently, URLs are generated with query-based authentication (meaning you'll get ridiculously long URLs) and HTTPS is used instead of HTTP, thereby preventing browsers from caching files. I've disabled query authentication and HTTPS, as this is the most common use case for serving files. This probably should be configurable, but that can be done in a revised patch or a follow-up patch.
  4. It allows you to set custom headers through the constructor (which I really needed for my daemon).

This greatly improves the usability of the S3BotoStorage custom storage system in its most common use case: as a CDN for publicly accessible files.
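
The extension-based guessing from point 1 can be illustrated with the standard library alone. This is a minimal sketch, not the patch itself; the helper name guess_content_type is hypothetical and only wraps mimetypes.guess_type with the fallback described above:

```python
import mimetypes


def guess_content_type(name, default="application/octet-stream"):
    """Return the MIME type guessed from the file name, or `default`.

    Hypothetical helper for illustration; not part of django-storages.
    """
    # guess_type returns a (type, encoding) tuple; type is None when unknown.
    content_type, _encoding = mimetypes.guess_type(name)
    return content_type or default


print(guess_content_type("logo.png"))            # image/png
print(guess_content_type("archive.unknownext"))  # application/octet-stream
```

Without such a header, a PNG would be stored with the generic binary type, which is what makes browsers download it rather than display it.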

[1] http://wimleers.com/tags/bachelor-thesis

Comments (15)

  1. stefantalpalaru

    1. Boto already sets the Content-Type header using mimetypes.guess_type() in the SVN trunk: boto/s3/key.py:360

    2. shouldn't we separate bucket level ACL from file level ACL?

    3. those "ridiculously long URLs" are the only way to implement URL expiration and prevent hot-linking.

  2. Wim Leers
    1. Hm weird, I see that now. But it didn't work for me. Probably due to the way S3BotoStorage works, then.
    2. That would indeed be better, but this patch at least makes it better than it was before.
    3. Fair enough, but URL expiration isn't exactly the most common requirement. Preventing hot-linking is, though. From a CDN point of view it makes a bit less sense, but it is still a desirable feature. And indeed, this should be configurable.

    What it boils down to is that my patch is probably too rough to get committed. It needs to expose more of the changes it makes as configurable settings (#2 & #3). #1 should be investigated more closely to find out why boto isn't already doing it automatically.

  3. Rich Leland

    Regarding item 1, the content type: you don't need to use mimetypes.guess_type(). The _save method has an argument called content, which has an attribute called content_type.

    So instead of doing this:

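    from mimetypes import guess_type  # module-level import needed by _save

    def _save(self, name, content):
        # Guess the MIME type from the file name; fall back to a generic binary type.
        content_type = guess_type(name)[0] or "application/x-octet-stream"
        headers = self.headers
        headers['Content-Type'] = content_type

        # Reuse the existing key if there is one, otherwise create it.
        k = self.bucket.get_key(name)
        if not k:
            k = self.bucket.new_key(name)
        k.set_contents_from_file(content, headers, policy=self.acl)
        return name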
    

    You can pull out the content_type like this:

    def _save(self, name, content):
        # Use the content type carried by the content object itself.
        content_type = content.content_type
        headers = self.headers
        headers['Content-Type'] = content_type

        # Reuse the existing key if there is one, otherwise create it.
        k = self.bucket.get_key(name)
        if not k:
            k = self.bucket.new_key(name)
        k.set_contents_from_file(content, headers, policy=self.acl)
        return name
    
  4. Anonymous

    I would suggest a smaller patch for now: just do the content-type part as per the comment above.

  5. Anonymous

    Not all content has a content_type. It is not required by django.core.files, and in particular it is not available on ContentFile. This patch against the latest codebase leverages boto to choose a sensible content type based on the name.
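
One way to reconcile the two approaches in this thread is to prefer content.content_type when it exists and fall back to guessing from the name otherwise. The following is a minimal sketch under stated assumptions, not the attached patch: the helper names resolve_content_type and build_headers are hypothetical, the two stand-in classes only mimic the presence or absence of the attribute, and nothing here calls boto or Django.

```python
import mimetypes

DEFAULT_CONTENT_TYPE = "application/octet-stream"


def resolve_content_type(name, content):
    """Prefer an explicit content_type attribute, else guess from the name."""
    # getattr with a default covers objects that have no content_type at all.
    explicit = getattr(content, "content_type", None)
    if explicit:
        return explicit
    guessed, _encoding = mimetypes.guess_type(name)
    return guessed or DEFAULT_CONTENT_TYPE


def build_headers(base_headers, name, content):
    """Return a new headers dict with Content-Type set for this file."""
    # Copy so the storage-wide headers dict is not modified per file.
    headers = dict(base_headers or {})
    headers["Content-Type"] = resolve_content_type(name, content)
    return headers


class FakeUpload:
    """Stand-in for content that carries a content_type (assumption)."""

    def __init__(self, content_type):
        self.content_type = content_type


class FakeContentFile:
    """Stand-in for content without a content_type attribute (assumption)."""


base = {"Cache-Control": "max-age=3600"}
print(build_headers(base, "photo.png", FakeUpload("text/plain")))
print(build_headers(base, "photo.png", FakeContentFile()))
```

The resulting dict could then be passed as the headers argument in a _save method like the ones shown earlier in the thread.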
