Friday, October 30, 2020

20 Year Old Bug and Counting

That selfish streak in some open-source projects

 Continuing with the topic of web servers and my recent work writing up some hands-on examples, this post is about a bug I found in Firefox. It's not new. 

 Before I do so, I do not want to pass the opportunity to touch on a topic that has been bugging me for quite some time now.

 Let me start with a few examples.

I downloaded the source code of Firefox (read below why). Nothing on the Mozilla project page warns you about the size of the download. It turns out the code is about 4GiB. And then you have to run some configuration scripts, which download even more stuff.

Don't ask me how much though: it's not mentioned anywhere, and you don't get notified either!

That download forced me to update my Homebrew installation on my laptop. Yeah, I hadn't done so in quite a while. Hence, there were plenty of packages that got updated. Well, you don't always get warned about how much data it's gonna cost you, nor are you asked for confirmation before it proceeds.

While writing my previous post I decided to test the download of River, the Apache project originally developed by Sun under the name of Jini. Nowhere does it say how large that download is gonna be... although this time the gzipped tarball ended up being somewhat less than 6MiB.

While browsing Bugzilla about the aforementioned bug, I came way too often across replies by what seemed to be Mozilla developers, asking the posters who reported buggy behaviour for a minimal example that would showcase the problem. As it happens, what led me to find that same bug was precisely writing a minimal test case for use with Firefox, namely, a basic server delivering some very simple HTML code back to the client.

So, developers of a huge project like Firefox, answering a question related to networking... and they cannot write a minimal server in C that serves some 10 lines of HTML themselves?

 When I was doing Sys Admin work and would receive some request about an incident or something not working as they expected, I would always 

  1. Quickly try to guess what might really be going on, then
  2. Come up with a minimal example showcasing what my guess was, and only then
  3. Ask the user for more clarification, if my little experiment didn't replicate the problem.

I find both kinds of behaviour, the unannounced downloads and the requests for minimal examples, unhelpful and selfish, for lack of a better word right now. That's all I'm gonna say about that and open source right now.

 

A 20 year old bug

And now about what I promised.  Here is the bug:

Any web page whose HTML code starts like the following will trigger the bug in Firefox:

<!doctype html>
<html>
<head>
<title>A 20yr-old bug</title>
<meta charset='utf-8'>
</head> 
<body>...</body>
</html>
The bug consists of Firefox duplicating the GET request: the full page loads twice!!

While this seems irrelevant, there are scenarios where it can become an actual burden. 
 
Some mobile plans do not offer that much data, and what they offer is expensive. 
 
Roaming plans are among them. The following roaming plan is the first that popped up when googling (prices are in CAD):
 
| Area \ Co. | Rogers (CAD/MiB) |
|------------|------------------|
| US         | 0.16             |
| EU         | 0.50             |
| Asia       | 1.00             |
| South A.   | 1.00             |

Mind you, a typical newspaper front page, e.g., the one from The Guardian below, amounts to somewhat over 1MiB of data: there is not just text, but also pictures and ads.

So, if you went to visit your family in Malaysia and checked out a newspaper with such "stutter"-triggering HTML code, you'd be paying $2 per page view if you used Firefox! Twice as much as with Safari, $1. I think Chrome also works well (haven't checked yet), hence only $1 as well.
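The arithmetic behind those numbers can be sketched as follows. The function name page_view_cost and its parameters are hypothetical, made up for this illustration; the rate comes from the table above and the page size is the rough 1MiB figure for a front page.

```c
/* Cost in CAD of one page view:
   rate (CAD per MiB) x page size (MiB) x number of times the page is fetched.
   Hypothetical helper, only meant to spell out the arithmetic. */
double page_view_cost(double cad_per_mib, double page_mib, int fetches)
{
    return cad_per_mib * page_mib * fetches;
}
```

With the Asia rate, page_view_cost(1.00, 1.0, 2) gives the $2 of a doubled request, and page_view_cost(1.00, 1.0, 1) the $1 of a single one.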

 

Why?

Because, not knowing better, those at Mozilla once decided that it was easier for the developers to have the browser request the page again whenever the charset meta tag is not the first tag within the head tag!

So, this HTML code doesn't trigger the bug:

<!doctype html>
<html>
<head>
<meta charset='utf-8'>
<title>A 20yr-old bug</title>
</head> 
<body>...</body>
</html>

 Here are the main Bugzilla entries related to this tag. Notice how over these 20-odd years the bug has been repeatedly rediscovered, fixed or simply disregarded!

  1. Bugzilla entry opened 20 years ago, last updated 10 months ago. That is, it's still open as of October 2020!
  2. Opened 17 years ago, closed 8 years ago... Status: Resolved!
Various postings in both of them show that people keep rediscovering it and landing on either bug entry depending on their luck.
 

How often does that happen?

I don't know, yet.
 
But I do know that the newspaper The Guardian will trigger that bug
 

The same applies to The New York Times

or the French one, Le Figaro.

 
Newspapers like Toronto Star, Zeit Online, or El Diario.es are "stutter" free: they have a charset meta tag that comes as the first tag within the head tag.

All this applies only to their front pages, which is the only HTML code I checked.
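Checking a front page by hand gets tedious, so here is a rough sketch of how that check could be automated. It is a heuristic, not an HTML parser: it assumes the first "<head" in the text is the head tag and it does not skip comments. The function names find_ci and charset_meta_is_first_in_head are hypothetical, invented for this example.

```c
#include <ctype.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Case-insensitive substring search; returns the first match or NULL. */
static const char *find_ci(const char *hay, const char *needle)
{
    size_t n = strlen(needle);
    for (; *hay; ++hay)
        if (strncasecmp(hay, needle, n) == 0)
            return hay;
    return NULL;
}

/* Returns 1 if the first tag after <head ...> is a <meta ...> tag that
   mentions "charset", 0 otherwise (also when there is no <head> at all). */
int charset_meta_is_first_in_head(const char *html)
{
    const char *p = find_ci(html, "<head");
    if (!p) return 0;
    p = strchr(p, '>');                 /* end of the <head ...> tag */
    if (!p) return 0;
    ++p;
    while (*p && isspace((unsigned char)*p)) ++p;   /* skip whitespace */
    if (strncasecmp(p, "<meta", 5) != 0) return 0;  /* first tag is not meta */
    const char *end = strchr(p, '>');   /* end of that meta tag */
    if (!end) return 0;
    const char *c = find_ci(p, "charset");
    return c != NULL && c < end;        /* "charset" must sit inside the tag */
}
```

Feed it the HTML of a page as a string: a result of 0 for a page that does have a charset meta tag means the tag comes later in the head, which is the layout described above as triggering the duplicate request.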
 
The Google search page in Canada doesn't seem to have any charset tag, so it won't trigger this bug.

Ah, here's something funny I just saw: in Safari at least, as the pics above show, Apple's web page www.apple.com does not show any source code! Very reassuring, Apple... OK, I checked with Firefox's Inspector (under the Tools -> Web Developer menu) and, indeed, Apple has a charset meta tag deep down in its head...

I need to double-check this bug while capturing packets with tcpdump... This duplication of requests could be happening way more often than I initially thought, every time we use Firefox to browse the web!
 
I'll update this post accordingly. Stay tuned.

A minimal setup to trigger the bug

You can do it at a high level, that is, by setting up a web server that counts how many times a given page has been requested and shows that number to anyone who connects.

Another way is to compile the following C code and run it. If you do so on your local machine, point your browser to http://localhost:8080. Try it with Firefox and with other browsers like Safari, Chrome, Opera, Konqueror...
 
Watch how the counter always jumps by 2 when using Firefox!
 
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string.h>     //memset
#include <unistd.h>        //close

#include <stdio.h>        //printf
#include <stdlib.h>        //calloc

#define PORT "8080"
#define RDBUFLEN 1024
#define WRBUFLEN RDBUFLEN

int main(){
    struct addrinfo hint;
    memset(&hint,0,sizeof(hint) );
    hint.ai_family = AF_INET6 ;
    hint.ai_socktype = SOCK_STREAM ;
    hint.ai_flags = AI_PASSIVE ; //for use w/ bind

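    //getaddrinfo
    struct addrinfo *sadd ;
    if( getaddrinfo(0, PORT, &hint, &sadd)!=0 ){
        fprintf(stderr,"getaddrinfo failed\n");
        return 1;
    }

    //socket
    int server_s = socket(sadd->ai_family,
                sadd->ai_socktype,
                sadd->ai_protocol
    );
    if( server_s<0 ){ perror("socket"); return 1; }

    //bind
    if( bind(server_s, sadd->ai_addr, sadd->ai_addrlen)<0 ){ perror("bind"); return 1; }
    freeaddrinfo(sadd);     //the address list is no longer needed

    //listen
    if( listen(server_s,3)<0 ){ perror("listen"); return 1; }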

    //accept: blocking : waits for client to connect
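    struct sockaddr_storage cadd;   //large enough for an IPv6 peer address
    socklen_t cadd_len = sizeof(cadd);
    int client_s = -1;
    int n=0;
    while( (client_s=accept(server_s, (struct sockaddr*)&cadd, &cadd_len) )>=0 ){
        cadd_len = sizeof(cadd);    //accept() overwrites it: reset for the next call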

        //recv
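        char*  buf = (char*) calloc(RDBUFLEN+1,sizeof(char));   //+1: room for the terminating '\0'
        if( buf==NULL ){ close(client_s); continue; }

        ssize_t rcvd = recv(client_s, buf, RDBUFLEN,0);
        if( rcvd<0 ) rcvd=0;    //recv failed: treat it as an empty request
        buf[rcvd]='\0';
        printf("Received:\n%s\n",buf);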

        //send
        char response[WRBUFLEN] =
            "HTTP/1.1 200 OK\r\n"
            "Connection: close\r\n"
            "Content-type: text/html\r\n\r\n";
        send(client_s,response,strlen(response),0);   //header text only: WRBUFLEN would also send the buffer's NUL padding

        sprintf(response,
            "<!doctype html>\r\n"
            //"<html><head><meta charset='utf-8'><title>Lemon Inc.</title></head>\r\n"      //Firefox sends 1 request only
            //"<html><head><title>Lemon Inc.</title></head>\r\n"                          //Firefox sends 1 request only
            "<html><head><title>Lemon Inc.</title><meta charset='utf-8'></head>\r\n"  //Buggy Firefox sends 2 requests!
            "<body style='background-color:#000000;color:#fff'>\r\n"
            "<h1 style='font-size:48;color:#ffff33'>Lemon Inc.</h1>\r\n"
            "<p>Your browser sent me following details:</p>\r\n"
            "<div style='text-align:left;width:50%%;margin:auto;'>\r\n"
        );
        send(client_s,response,strlen(response),0);

        send(client_s, buf , rcvd,0);   //echo the raw request back to the browser

        sprintf(response,"</div><br>Number of connections today: %d\r\n",++n);
        send(client_s, response , strlen(response), 0);

        sprintf(response , "</body></html>\r\n");
        send(client_s, response , strlen(response), 0);

        free(buf);      //allocated anew for every connection
        close(client_s);
    }
    //close
    close(server_s);
    printf("\n\nServer shutdown\n");
    return 0;
}
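One caveat about the counter: the server above counts every accepted connection, and browsers may open extra connections for other resources, /favicon.ico being the usual one. A small sketch of a filter that only recognises requests for the page itself follows. The function name is_root_page_request is hypothetical and not part of the listing above; it assumes the request buffer is NUL-terminated, as buf is.

```c
#include <string.h>

/* Returns 1 if the raw request starts with "GET / ", i.e. it asks for the
   root page itself, and 0 for anything else (e.g. "GET /favicon.ico "). */
int is_root_page_request(const char *request)
{
    static const char prefix[] = "GET / ";
    return strncmp(request, prefix, sizeof(prefix) - 1) == 0;
}
```

In the loop above one could then increment the counter only when is_root_page_request(buf) returns 1, so that a jump by 2 can only come from two requests for the same page.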
 

Conclusions

I've been adding a conclusions section to my posts lately. Not sure what to conclude yet...

I do distrust projects like Mozilla Firefox that have fancy webpages but then deliver their code in a way that requires a combination of software management tools and languages just for getting that code and setting up the development environment.

In this case those are Homebrew, Python and Mercurial... and their own "bootstrapping script".

And then along comes that script and tells you to install Mercurial via pip3 install Mercurial, while Mozilla's web docs tell you to use Homebrew to install that VCS...

I followed the script instructions as that was running right there in my terminal. 

Well, it turns out, it's wrong. Mercurial is expected in a cellar under the Homebrew installation on your Mac.

I'm not going to bother with Firefox any more. I was curious to see how complicated it would be to tweak the networking code. But it's far too much trouble: I don't feel comfortable not knowing exactly where some stuff is being installed, how easy it is to get rid of, or to what extent it might mess up some system-wide configuration. OK, you can find that out eventually. But it's not like I pip install things all the time. So it bothers me. Now I have to clean my laptop of everything that was installed...

The attitude towards the bug itself, as shown in those Bugzilla reports, and this way of delivering a development framework that doesn't actually work are somewhat paradigmatic of these times of fancy high-level programming in Python, Ruby, Perl...
