Opened 5 years ago

Last modified 4 years ago

#1877 assigned bug

Generated images not browser cached

Reported by: planetcruiser Owned by: laurentS
Priority: critical Milestone: unassigned
Component: FrameWork Keywords:
Cc:

Description (last modified by planetcruiser)

Issue:

  • All images generated from uploads (avatar, gallery) are sent with aggressive non-caching settings
  • Those images will be downloaded again at *every* single view, even when navigating back to a page you visited seconds before

Solution:

  • Allow caching (remove cache-control)
  • Set expires to 1h or 1d at least

Notes:

  • A typical HTTP header for a generated image looks like this:
    $ curl -I http://bw/gallery/thumbimg?id=96
    HTTP/1.1 200 OK
    Date: Fri, 04 Jan 2013 04:05:42 GMT
    Server: Apache/2.2.22 (Ubuntu)
    X-Powered-By: PHP/5.4.6-1ubuntu1.1
    Set-Cookie: sidTB=Ck833kcI7F25x0WDWStY5nazrR2; path=/
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Content-Type: image/jpeg
    
  • Static images like logo and design elements are cached properly:
    $ curl -I http://bw/images/logo_index_top.png
    HTTP/1.1 200 OK
    Date: Fri, 04 Jan 2013 04:17:55 GMT
    Server: Apache/2.2.22 (Ubuntu)
    Last-Modified: Mon, 03 Dec 2012 15:03:37 GMT
    ETag: "128ac62-cab-4cff40dd56bdb"
    Accept-Ranges: bytes
    Content-Length: 3243
    Content-Type: image/png
    
  • Rendered images may not be stored on the server properly in some cases (needs confirmation), further increasing server load because they are re-sized at every single view
  • Traffic and server load should be observed before and after fixing this to judge improvement

Priority:

  • Critical because it slows down the site in general and causes lots of extra traffic for the users (think mobile data plans with limited download traffic)

Change History (10)

comment:1 Changed 5 years ago by planetcruiser

  • Description modified (diff)

comment:2 Changed 5 years ago by laurentS

(See mailing list http://lists.bewelcome.org/pipermail/bw-dev-discussion/2013-January/010415.html) patch proposed at https://gitorious.org/~bamse207/bewelcome/bamse207s-rox/commits/1877-cache-image

A good way to improve even further (after doing the same for group thumbnails) would be to have urls for empty avatars/group thumbs all be the same.
What I mean is instead of /members/avatar/someusername, have <img src="/members/avatar/empty"> and cache that pic for a month or a year.

comment:3 Changed 5 years ago by planetcruiser

  • Owner set to laurentS
  • Status changed from new to assigned

looks like laurent is up for it :)

comment:4 follow-up: Changed 5 years ago by planetcruiser

first up: great patch! i didn't even dare to ask for introducing etag. :)

questions:

  • are you just changing the way headers are set or also the way things are written to the disk? it's not clear to me at first glance, because you move around the @copy statement and change its parameters.
  • could you confirm if re-sized images are re-sized at every request? (gallery)

bug:

  • after i update my profile image i see the old one until i press reload. maybe we still need a "must-revalidate"?

code review (when adding new code or modifying existing code):

  • add spaces around dots and after commas (convention)
  • no inline if/ternary operator (reduces readability)
  • add phpdoc blocks for new methods (see ./build/api/ for reference), i see you have used them for _setCachingHeaders()

also see http://trac.bewelcome.org/wiki/ProgrammingGuideline#PHPIII:SyntaxFormatting-KeepitReadable - i know that document is a mess itself, but it may serve as some sort of orientation

further comments:

  • PDataDir::get_etag() is more of a "generateETag()", right? or is it also intended to return a predicable result? if not, maybe add some randomness?
  • there might be a constant for this date format: 'D, d M Y H:i:s \G\M\T' - please check
  • please add a comment for complex if clauses like "(($ifmod || $iftag) && ($ifmod !== false && $iftag !== false))" and explain what it does - or express it in a more readable way ;)

comment:5 in reply to: ↑ 4 ; follow-up: Changed 5 years ago by laurentS

Replying to planetcruiser:

questions:

  • are you just changing the way headers are set or also the way things are written to the disk? it's not clear to me at first glance, because you move around the @copy statement and change its parameters.

just headers. I don't mess with data on disk. For what it's worth, here's some ref I used: http://www.mnot.net/cache_docs/

  • could you confirm if re-sized images are re-sized at every request? (gallery)

profile avatars are not recalculated at every request, I just checked. For gallery files, I don't know, I'll have to check where the code is...

bug:

  • after i update my profile image i see the old one until i press reload. maybe we still need a "must-revalidate"?

Fixed that, I was a bit too greedy yesterday! Please confirm it works.

code review (when adding new code or modifying existing code):

  • add spaces around dots and after commas (convention)
  • no inline if/ternary operator (reduces readability)
  • add phpdoc blocks for new methods (see ./build/api/ for reference), i see you have used them for _setCachingHeaders()

also see http://trac.bewelcome.org/wiki/ProgrammingGuideline#PHPIII:SyntaxFormatting-KeepitReadable - i know that document is a mess itself, but it may serve as some sort of orientation

and who was bitching about codesniffing? ;)

further comments:

  • PDataDir::get_etag() is more of a "generateETag()", right? or is it also intended to return a predicable result? if not, maybe add some randomness?

it needs to be predictable, so that a given file always returns the same etag (we want to know if the file content was changed). Ideally, we'd use md5 checksums, but they're pretty CPU heavy. For now, file modification time (when the user uploaded it) . filesize, which should be unique enough for our purpose.

  • there might be a constant for this date format: 'D, d M Y H:i:s \G\M\T' - please check

Against all odds, it looks like PHP doesn't have a constant for HTTP timestamps (see http://www.w3.org/Protocols/rfc2616/rfc2616-sec3 section 3.3.1 for requirements...)

http://php.net/manual/en/class.datetime.php#datetime.constants.types mentions DATE_RFC1123 but it doesn't even match the RFC! well done for a web language :)

  • please add a comment for complex if clauses like "(($ifmod || $iftag) && ($ifmod !== false && $iftag !== false))" and explain what it does - or express it in a more readable way ;)

fixed, that was crappy indeed.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 5 years ago by planetcruiser

Replying to laurentS:

bug:

  • after i update my profile image i see the old one until i press reload. maybe we still need a "must-revalidate"?

Fixed that, I was a bit too greedy yesterday! Please confirm it works.

i tested it, looks good now. but i was thinking, a must-revalidate still generates 1 header request for each view of the image, right? i wish we could get around that, but suppose we can't.

but your patch improves the current situation by 300% already.

shall we handle gallery images in a separate ticket, or do you want to go for them next as part of this ticket (my favourite, hehe)?

comment:7 in reply to: ↑ 6 ; follow-up: Changed 5 years ago by laurentS

Replying to planetcruiser:

i tested it, looks good now. but i was thinking, a must-revalidate still generates 1 header request for each view of the image, right? i wish we could get around that, but suppose we can't.

if the url remains the same, there's no way for the browser to know whether the server has something new without asking, so no, no way around it.

Then we can start talking cache-busters. Say instead of using

<img src="members/avatar/username?500>

we make it

<img src="members/avatar/username_HASH?500>

with HASH being something unique (the same as the Etag for instance). You change your profile pic, HASH changes, the browser knows without a doubt if it has the file or not.

Now you can tell the browser to cache the image *forever* :)

It's the technique we use at wikiotics for static files (.js, .css, and images), and it's rock solid.

The only trick here is to make sure that *everywhere* we link to the profile image, we do it through the same function, so that the url is built in a unique way.

And we could make /members/avatar/username return a 302 (temp redirect) to the current URL, to ensure users don't get 404s, say with old mail or the odd people bookmarking avatars :)

but your patch improves the current situation by 300% already.

300%? on server load? network traffic? sweet anyway!

shall we handle gallery images in a separate ticket, or do you want to go for them next as part of this ticket (my favourite, hehe)?

It's the same issue, let's do it here...

comment:8 in reply to: ↑ 7 ; follow-up: Changed 5 years ago by planetcruiser

Replying to laurentS:

And we could make /members/avatar/username return a 302 (temp redirect) to the current URL, to ensure users don't get 404s, say with old mail or the odd people bookmarking avatars :)

nice idea, but overkill here i think. let's keep it simple for now?

but your patch improves the current situation by 300% already.

300%? on server load? network traffic? sweet anyway!

all of those things! awesomeness as well. :) this reminds me, how to monitor the improvement? we have no munin installed yet. hm..

shall we handle gallery images in a separate ticket, or do you want to go for them next as part of this ticket (my favourite, hehe)?

It's the same issue, let's do it here...

cool. shall we try to get this into the current release cycle?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 5 years ago by laurentS

Replying to planetcruiser:

Replying to laurentS:

And we could make /members/avatar/username return a 302 (temp redirect) to the current URL, to ensure users don't get 404s, say with old mail or the odd people bookmarking avatars :)

nice idea, but overkill here i think. let's keep it simple for now?

sure, the current scheme works fine I guess.

cool. shall we try to get this into the current release cycle?

Would love to, but as mentioned elsewhere, I'm running out of time, so I'll have to back off here, sorry. Getting the same functionnality on gallery should almost be a matter of copy/pasting the code...

If noone else has done it by the time I come back from the field, I'll be happy to write the cache-busting code.

comment:10 in reply to: ↑ 9 Changed 4 years ago by planetcruiser

Replying to laurentS:

Getting the same functionnality on gallery should almost be a matter of copy/pasting the code...

sweet. do we need a new ticket for this?

If noone else has done it by the time I come back from the field, I'll be happy to write the cache-busting code.

when will you be back? and yes, looks like you'll be the one finishing this. ;)

Note: See TracTickets for help on using tickets.