Closed Bug 390341 Opened 17 years ago Closed 14 years ago

support gzip encoding and content-length for Tinderbox showlog.cgi

Categories

(Webtools Graveyard :: Tinderbox, enhancement)

x86
Linux
enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rhelmer, Assigned: rhelmer)

References

()

Details

Attachments

(9 obsolete files)

Supporting HTTP gzip encoding would reduce transfer sizes, and probably greatly increase perceived transfer time for most users. 

I think that this would be especially pronounced when sending gzipped Tinderbox logs, as they would not have to be uncompressed on the server side before sending.
(In reply to comment #0)
> Supporting HTTP gzip encoding would reduce transfer sizes, and probably greatly
> increase perceived transfer time for most users. 

Er, should be greatly *reduce* perceived time :)
I took a look at showlog.cgi tonight, which is where I think the biggest use for this feature would be.

Here is what currently happens:

Build logfile (received via email) is compressed and stored in tree directory. Every time a request is made to showlog.cgi, the compressed logfile is read in it's entirety and either a "brief" or "complete" HTML page is generated.

If we instead generated and stored the "brief" and "complete" HTML pages as gzipped HTML, we simply write the contents of that file to the user agent if "Content-encoding: gzip" is set. This could be a huge win in server processing time as well as bandwidth and load time for the client.

Even if the user agent doesn't support gzip encoding (I can't think of very many nowadays that don't, but it can be checked using the client-side Accept-encoding header), it'd still take less time to uncompress these pages than to rebuild the report each time.
Summary: support gzip encoding for Tinderbox → support gzip encoding and content-length for Tinderbox showlogs.cgi
Assignee: morgamic → robert
I hacked this up locally and it seems doable, for the most part. I believe that all of the dynamic bits in showlog.cgi could be moved to helper functions which are called when the original incoming build log email is processed (it's all printed directly to stdout right now as it's processed), with the exception of notes (which are entered when a build is "starred").

Notes go into the "build comments", here is an example:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274159009.1274159743.31762.gz

There are ways to enable the feature without preserving the current output (using ajax, iframe, etc.), but these seem like a much bigger task. I am assuming that having build notes on these pages is a useful feature that no one wants to sacrifice.

My only idea so far is to uncompress the first few lines of the report, insert the notes, then compress this and append the rest of the file. I think it should be possible to determine what the content-length will be, so at least the rest of the file could be printed to stdout as it is read.

Maybe someone else can correct the above, or has a better idea?
Summary: support gzip encoding and content-length for Tinderbox showlogs.cgi → support gzip encoding and content-length for Tinderbox showlog.cgi
Can you just regenerate the cached HTML when build notes are added? Might take a second or two, but still better than what we're doing now.

(And yeah, we can't live without being able to star builds, presently.)
(In reply to comment #5)
> Can you just regenerate the cached HTML when build notes are added? Might take
> a second or two, but still better than what we're doing now.

Yes, I think this would work fine. Thanks!

One thing I just noticed is that there already is a cache in place for the "brief" log.

The README says that this file gets updated when the log does, but my read of the source indicates that it gets created the first time the brief log is loaded:

http://mxr.mozilla.org/mozilla/source/webtools/tinderbox/showlog.cgi#112

The "Build Error Summary" and following sections are cached; notes and links above that are done dynamically at request-time, so this approach would have the same problem with regard to gzipping that I was stumbling on in comment 4. 

I think that this older approach should be removed, and the full and brief files should be gzipped and updated when the log is updated or when notes are added.
Attached patch PoC gzip-encoding for brief/full log reports (obsolete) — — Splinter Review
Here is a working proof-of-concept for the approach we discussed in IRC. 
This does not implement caching (yet), but does the following:

* send back gzipped content if accept-encoding request header contains gzip
* remove global dependencies from all functions in showlog.cgi and move the functions to tbglobals.pl (so they can be used by other scripts, to build the cache when email or notes come in)
* remove print_fragment (I can't figure out how to get it working, am I wrong here? I've never seen this feature used)
* remove print_log_section (looks like dead code, is not called from anywhere)
* remove existing brief.html cache mechanism
* comment out EmitHtmlTitleAndHeader until we decide what approach to take for output

This makes showlog.cgi much simpler, although there's still room for improvement. I am not sure if tbglobals.pl is the right place to move all this, but it seems to be the place for sharing code between different parts of tinderbox.

I am wondering what you think about the general idea of passing around a filehandle that might be STDOUT or might be gzip (which may be writing to STDOUT or a file). This same approach could be used in other parts of Tinderbox, if it makes sense. In the case where we do have a cache we should be able to just read the file and write directly to STDOUT.

Finally, I think we should construct the cache at request time if it does not exist for some reason (e.g. previous version of Tinderbox was in place when the log was received). Then there will be just one code path for displaying the brief and full reports: either cat the file to the network, or uncompress it and cat it (in the case where the client doesn't set Accept-Encoding: gzip).
Attachment #446387 - Flags: feedback?(bear)
(In reply to comment #8)
> Created an attachment (id=446387) [details]
> PoC gzip-encoding for brief/full log reports
> 
> Here is a working proof-of-concept for the approach we discussed in IRC. 
> This does not implement caching (yet), but does the following:
> 
> * send back gzipped content if accept-encoding request header contains gzip
> * remove global dependencies from all functions in showlog.cgi and move the
> functions to tbglobals.pl (so they can be used by other scripts, to build the
> cache when email or notes come in)
> * remove print_fragment (I can't figure out how to get it working, am I wrong
> here? I've never seen this feature used)
> * remove print_log_section (looks like dead code, is not called from anywhere)
> * remove existing brief.html cache mechanism
> * comment out EmitHtmlTitleAndHeader until we decide what approach to take for
> output
> 
> This makes showlog.cgi much simpler, although there's still room for
> improvement. I am not sure if tbglobals.pl is the right place to move all this,
> but it seems to be the place for sharing code between different parts of
> tinderbox.

until a true refactoring takes place, tbglobals.pl is the least smelly place to place it IMO

> 
> I am wondering what you think about the general idea of passing around a
> filehandle that might be STDOUT or might be gzip (which may be writing to
> STDOUT or a file). This same approach could be used in other parts of
> Tinderbox, if it makes sense. In the case where we do have a cache we should be
> able to just read the file and write directly to STDOUT.

+1 to this - it will allow people who do downstream changes to the code to have a single variable to send data to without the propagation of if/else checks that otherwise would be required

> 
> Finally, I think we should construct the cache at request time if it does not
> exist for some reason (e.g. previous version of Tinderbox was in place when the
> log was received). Then there will be just one code path for displaying the
> brief and full reports: either cat the file to the network, or uncompress it
> and cat it (in the case where the client doesn't set Accept-Encoding: gzip).

again +1
Comment on attachment 446387 [details] [diff] [review]
PoC gzip-encoding for brief/full log reports

I don't see anything from a code reading, but for sure we are going to need to run this on a staging tinderbox and see what breaks.
Attachment #446387 - Flags: feedback?(bear) → feedback+
Thanks for the feedback! 

We do definitely need to run this on staging and bang on it for a bit when it's ready. I am a little concerned that the fragment feature might need to be re-added, but I really want to determine if it's actually working or not first.

Also I am worried that some of these functions may be modifying the vars with the assumption that they are global; I think that changing this is the right thing to do so it's probably worth the fallout (IMHO) to get away from using and mutating globals like this. 

I have been doing very basic local testing though and it seems ok; having an automated functional test suite (could be very basic) would be a huge timesaver here.
Attached patch add per-request caching and content-length (obsolete) — — Splinter Review
Build on the previous patch:

* check for cache file and build at request time if it does not exist
* add content-length header if cache file exists

Only remaining task (I think) is to tighten up and generalize this caching approach, and make it available to other code (so the caches can be built when the email is received and/or notes are updated).

showlog.cgi goes through quite a bit of machination to do this right now, I would like to simplify that so we don't have that same rigmarole in three places.
Attachment #446387 - Attachment is obsolete: true
Attachment #446455 - Flags: feedback?(bear)
I've been loading logs from production and comparing them in a test env and everything seems fine so far. I've diffed the output and only found expected differences. 

One thing this patch definitely needs is for showlog.cgi to check that the underlying build exists and exit as soon as possible (before writing cache files or doing any work, so unexpected errors don't cause corruption in the caches). Right now production gets pretty far in processing if you give it an invalid file so this is not a new issue, but it impacts the integrity of the cache so it's more important now: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/123.gz
I noticed something a little disturbing in production, the "brief" and "full" reports are exactly the same:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274332018.1274332772.29524.gz
vs.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274332018.1274332772.29524.gz&fulltext=1

Looks like the parsers don't really work anymore? Or maybe I am missing something here... the brief logs don't look very brief in any case. I checked a somewhat random sample of different logs, so maybe it still works in some cases.

With this gzip patch this test log goes from 4MB down to 400k, which is great for transfer time.. but loading that 4MB log into the browser is still pretty painful, at least on my MBP (Firefox is completely unresponsive for a second or two when the transfer is finished, which does not feel nice). Having a "brief" log that only shows errors would probably make this feel much nicer (which I think is how tinderbox is supposed to work, and used to work, and maybe still does sometimes?).

Maybe someone can shed light here on why the brief logs aren't very brief?
Attached patch Exit early if logfile does not exist. (obsolete) — — Splinter Review
Attachment #446455 - Attachment is obsolete: true
Attachment #446522 - Flags: feedback?(bear)
Attachment #446455 - Flags: feedback?(bear)
(In reply to comment #15)
> Maybe someone can shed light here on why the brief logs aren't very brief?

IIRC, if the errorparser doesn't find any errors, then it doesn't actually generate a brief log, you just get the full log.
(In reply to comment #17)
> (In reply to comment #15)
> > Maybe someone can shed light here on why the brief logs aren't very brief?
> 
> IIRC, if the errorparser doesn't find any errors, then it doesn't actually
> generate a brief log, you just get the full log.

That makes sense, thanks. I went looking for logs with errors and found that it's working as you describe, it skips a bunch of lines if there's an error:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274376458.1274377035.18775.gz

But no lines are skipped if there are no errors.

I am definitely veering outside of the scope of this bug, but I just want to point out that if you want a better experience loading logs in the browser from Tinderbox I definitely suggest cutting down on the size of the full logs if possible, and perhaps making brief logs briefer even in the case where there are no errors. Loading a file this large into the browser is painful no matter how it's transferred.
Probably deserves a second bug, yeah, that experience sucks.
Attached patch create gzip cache on addnote or processbuild (obsolete) — — Splinter Review
I have generalized the cache creation to a relatively simple function in tbglobals called create_cache:

create_cache($errorparser, $fulltext, $logfile, $tree, $buildtime, $buildname);

This is now the only entry point to all the old showlog.cgi code that I moved to tbglobals. 

The caches are first written to a tmp file, then moved when the process is finished (to avoid users seeing a partial log while the cache is being rebuilt, in the case that a cache is being re-created).

$fulltext is either 0 or 1, to mean brief or full (this matches the HTTP parameter passed to showlog.cgi).

This function is called in the following places:

1) in showlog.cgi, if the requested cache does not exist
2) in addnote.cgi (when a build is "starred")
3) in processbuild.pl (when a new build email is received).

I don't like how #2 makes the user wait when submitting a note. Any ideas for dealing with that?

Besides the concern with addnote.cgi, I think this is pretty much ready for a real review.
Attachment #446522 - Attachment is obsolete: true
Attachment #446572 - Flags: review?(bear)
Attachment #446522 - Flags: feedback?(bear)
Attached patch use PID in tmp cache filename (obsolete) — — Splinter Review
Same as last time, but allow for multiple processes attempting to rebuild the cache by putting the PID in the tmp cache filename (whichever process does the mv last wins).

There *might* be a race condition here but I think because the notes.txt is written before the cache files it will not be an issue if e.g. multiple users add notes around the same time.. 

However if needed then I don't think using a lock would be too onerous here since this shouldn't happen very often.
Attachment #446572 - Attachment is obsolete: true
Attachment #446576 - Flags: review?(bear)
Attachment #446572 - Flags: review?(bear)
Attached patch fixup banner/blurb printing (obsolete) — — Splinter Review
Attachment #446576 - Attachment is obsolete: true
Attachment #446579 - Flags: review?(bear)
Attachment #446576 - Flags: review?(bear)
Any ETA on this?  This is I believe the root cause of things like starring tinderbox orange taking so long; clicking on a log in tbpl currently takes around 20-30 seconds for the log to load on the office network, whereas with cache + gzip encoding, it takes around 1 second.  Would be a pretty big usability win there.
This has been leering at me from my review queue but Android and OS X work has trumped any time adjusting i've done.

Your comment helps push it higher in my queue - it's not forgotten tho
(In reply to comment #23)
> Any ETA on this?  This is I believe the root cause of things like starring
> tinderbox orange taking so long; clicking on a log in tbpl currently takes
> around 20-30 seconds for the log to load on the office network, whereas with
> cache + gzip encoding, it takes around 1 second.  Would be a pretty big
> usability win there.

Perhaps a shortcut instead of having to patch tinderbox server would be something like bug 571085 (have Apache gzip anything that comes out of showlog.cgi).

Tinderbox would still be doing gunzip and error-log parsing on the original log per-request so it is not as efficient as the patch I posted, but at least the HTTP output would be gzipped.
Depends on: 574524
(In reply to comment #25)
> Perhaps a shortcut instead of having to patch tinderbox server would be
> something like bug 571085 (have Apache gzip anything that comes out of
> showlog.cgi).
> 
> Tinderbox would still be doing gunzip and error-log parsing on the original log
> per-request so it is not as efficient as the patch I posted, but at least the
> HTTP output would be gzipped.

Tested locally and mod_deflate seems to work with CGIs ok, filed bug 574524 to offload the gzip-encoding job to Apache.
Apache mod_deflate doing gzip (bug 574524) is live on tinderbox-stage now and seems to help a lot, I am going to put more details on the tests I've done in that bug.

Can people please help test http://tinderbox-stage.mozilla.org so this can go live?
(In reply to comment #27)
> Can people please help test http://tinderbox-stage.mozilla.org so this can go
> live?

This seems fine in most cases but latest minefield just beachballs on my mbp for minutes if I load a particularly large one:

http://tinderbox-stage.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850370.21087.gz

The production equivalent of that is:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850369.10000.gz&fulltext=1

The latter takes way longer to fully load but the browser is at least usable in the interim...
(In reply to comment #28)
> The production equivalent of that is:
> 
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850369.10000.gz&fulltext=1

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850369.10000.gz is more exactly equivalent, no significant difference in this case though.
This is because those pages are still served as UTF8, so we end up hitting bug 566066 -- we just hit it a lot more quickly when you get the gzip'd data.  It would be far, far better if these pages forced a content-encoding of ascii -- someone who really needs to see the UTF8 can just select that encoding manually.

Try loading

http://vlad.off.net/misc/fetchlog.php?log=http%3A%2F%2Ftinderbox.mozilla.org%2Fshowlog.cgi%3Flog%3DFirefox%2F1277845327.1277850369.10000.gz

which is that same log, except going through my script which forces the encoding.
Attachment #446579 - Attachment is obsolete: true
Attachment #446579 - Flags: review?(bear)
Attached patch force charset=us-ascii for showlog.cgi (obsolete) — — Splinter Review
Attachment #455055 - Flags: review?(bear)
Attached patch force charset=iso8859-1 for showlog.cgi (obsolete) — — Splinter Review
Attachment #455055 - Attachment is obsolete: true
Attachment #455057 - Flags: review?(bear)
Attachment #455055 - Flags: review?(bear)
Attachment #455057 - Attachment is patch: true
Attachment #455057 - Attachment mime type: application/octet-stream → text/plain
Attached patch force charset=iso8859-1 for showlog.cgi (obsolete) — — Splinter Review
oh bugzilla how I've missed your ways
Attachment #455057 - Attachment is obsolete: true
Attachment #455058 - Flags: review?(bear)
Attachment #455057 - Flags: review?(bear)
Attachment #455058 - Flags: review?(bear) → review+
Comment on attachment 455058 [details] [diff] [review]
force charset=iso8859-1 for showlog.cgi

This regresses bug 474657.
Attachment #455058 - Flags: review-
(In reply to comment #34)
> (From update of attachment 455058 [details] [diff] [review])
> This regresses bug 474657.

Yes it does. It's not clear to me from reading bug 474657 how bad it is (does it obscure errors? does it just look ugly?)

How bad is bug 474657? Is it worth regressing to get (much) faster-loading logs?

Failing that are there any alternative workarounds for bug 566066 we can consider?
(In reply to comment #35)
> (In reply to comment #34)
> > (From update of attachment 455058 [details] [diff] [review] [details])
> > This regresses bug 474657.
> 
> Yes it does. It's not clear to me from reading bug 474657 how bad it is (does
> it obscure errors? does it just look ugly?)
> 
> How bad is bug 474657? Is it worth regressing to get (much) faster-loading
> logs?

The dup bug 446505 has an example.
Regressing 474657 is perfectly fine, IMO -- we just need to communicate that people who need UTF8 can explicitly select that encoding.  The garbling in there is due to msvc using curly quotes for the output, but even then it's "straigtfroward" grabling, it doesn't make the log unreadable.
One way to fix the gcc using curly quotes issue is to just set LC_ALL=C in the build environment that gets sent to the buildmasters; that way it won't try to send any unicode goop.
(In reply to comment #38)
> One way to fix the gcc using curly quotes issue is to just set LC_ALL=C in the
> build environment that gets sent to the buildmasters; that way it won't try to
> send any unicode goop.

I will file a separate bug about that.
(In reply to comment #37)
> Regressing 474657 is perfectly fine, IMO -- we just need to communicate that
> people who need UTF8 can explicitly select that encoding.  The garbling in
> there is due to msvc using curly quotes for the output, but even then it's
> "straigtfroward" grabling, it doesn't make the log unreadable.

I agree, this seems like the lesser evil here. We're talking about several minutes versus a few seconds to load a typical build log (see comment #30).

reed made the point in irc that bug 566066 is a Firefox bug and should be fixed in Firefox. I am (very) sympathetic to that viewpoint, but I would argue that not having ASCII encoded, gzipped logs is slowing down development work on Firefox ;)

I feel like someone is going to be unhappy no matter what we do. I am inclined to regress bug 474657, does someone else need to break the tie here? Take it to the newsgroups (which one)?

Perhaps this issue should be spun out to a separate bug that depends on bug 566066 (so it can be reverted when that one is closed).
(In reply to comment #40)
> (In reply to comment #37)
> Perhaps this issue should be spun out to a separate bug that depends on bug
> 566066 (so it can be reverted when that one is closed).

On further reflection, why not reopen bug 474657 and discuss there? That is probably the right place to set dependencies like bug 566066 and setting the build environment locale to something else like comment #37 suggests.

My main concern is not surprising people who expect the current behavior, they are probably CC'd on bug 474657 for the most part.

I don't have time to do this right now but if no one else has a better plan I will do this sometime this weekend.
Depends on: 474657
(In reply to comment #41)
> (In reply to comment #40)
> > (In reply to comment #37)
> > Perhaps this issue should be spun out to a separate bug that depends on bug
> > 566066 (so it can be reverted when that one is closed).
> 
> On further reflection, why not reopen bug 474657 and discuss there? That is
> probably the right place to set dependencies like bug 566066 and setting the
> build environment locale to something else like comment #37 suggests.

Reopened bug 474657 and set to block this one.

(In reply to comment #39)
> (In reply to comment #38)
> > One way to fix the gcc using curly quotes issue is to just set LC_ALL=C in the
> > build environment that gets sent to the buildmasters; that way it won't try to
> > send any unicode goop.
> 
> I will file a separate bug about that.

I will file this (or not) depending on what happens in bug 474657, leaving it alone for now.
(In reply to comment #38)
> One way to fix the gcc using curly quotes issue is to just set LC_ALL=C in the
> build environment that gets sent to the buildmasters; that way it won't try to
> send any unicode goop.

Since bug 474657 was re-closed, I've opened bug 577310 to get this set in the build environment.

While this should help with build logs, I think there are legitimate cases when other types of logs (many test logs display filenames which contain unicode characters, I have been told).

This might be a good interim approach, until bug 566066 is fixed or an effective workaround is found:

1) patch tinderbox so encoding can be specified
2) if encoding is ascii, set content-type to text/html text/html; charset=iso8859-1
3) if encoding is anything else, set content-type to application/x-gzip

#3 means that UTF-8 (and anything non-ascii) will offer a gzip file for download, which the user can open in another application.
Depends on: 577310
Depends on: 566066
No longer depends on: 474657
(In reply to comment #28)
> (In reply to comment #27)
> > Can people please help test http://tinderbox-stage.mozilla.org so this can go
> > live?
> 
> This seems fine in most cases but latest minefield just beachballs on my mbp
> for minutes if I load a particularly large one:
> 
> http://tinderbox-stage.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850370.21087.gz
> 
> The production equivalent of that is:
> 
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1277845327.1277850369.10000.gz&fulltext=1
> 
> The latter takes way longer to fully load but the browser is at least usable in
> the interim...

The more I test this, the harder I find to reproduce it.. I'll try using Firefox instead of Minefield.

At this point I think the gzip-encoding change on tinderbox-stage is better overall and we should move forward with it while continuing to track these other issues.
Yeah I can't reproduce the severe beach-balling I was getting before loading that log from tinderbox-stage, using a fresh profile on the latest Firefox release or latest nightly.

Thanks for philor for trying (and not being able) to reproduce this.

I don't think there's a reason to hold for a fix or workaround on bug 566066. I'll file a bug to get this enabled in production.
Depends on: 577324
No longer depends on: 566066
Attachment #455058 - Attachment is obsolete: true
bug 577324 enabled gzip-encoding for all CGIs, which satisfies at least part of this bug. If there is any fallout due to this I will help address it.

Tinderbox *should* be doing caching for brief logs (although it's been inconsistent in my local testing) so the overhead of this versus what I was working towards with attachment 577324 [details] [diff] [review] should be minimal.

Barring any problems, I do not intend to keep working on this, if anyone else feels it needs to be reopened feel free to do so and change the assignee.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
(In reply to comment #46)
> Tinderbox *should* be doing caching for brief logs (although it's been
> inconsistent in my local testing) so the overhead of this versus what I was
> working towards with attachment 577324 [details] [diff] [review] should be minimal.

I meant to say attachment 446579 [details] [diff] [review]
The easy and effective workaround is to force iso-8859-1, which as phil stated in the other bug is to "intentionally serve tinderbox logs with an encoding which is not capable of representing all the characters in the log".

The net result of that change would be:

- For 95% of developers, they will just have very fast loading tinderbox pages, with random bits of garbage here or there from gcc (until bug 577310 is fixed)

- For the 5% of developers that need to see Thai line break test details, they can manually select a UTF8 encoding.

It's not like the data is lost or thrown away...
Filed bug 581164 on the remaining force-ascii issue to centralize discussion there.
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: