
Wikimedia 404 error page shouldn't use a Refresh HTTP header to implement the auto-redirect to /wiki
Closed, Resolved; Public

Description

The 404 page used to have a meta-refresh tag. Now it apparently has a Refresh HTTP header:


mzmcbride@gonzo:~$ curl -I "http://en.wikipedia.org/wfhsdklfjsdklfj"
HTTP/1.0 404 Not Found
Date: Thu, 08 Mar 2012 04:23:19 GMT
Server: Apache
Cache-Control: s-maxage=2678400, max-age=2678400
X-Wikimedia-Debug: prot=http:// serv=en.wikipedia.org loc=/wfhsdklfjsdklfj
Refresh: 5; url=http://en.wikipedia.org/wiki/wfhsdklfjsdklfj
Content-Length: 5091
Content-Type: text/html; charset=utf-8
Age: 166
X-Cache: HIT from cp1019.eqiad.wmnet
X-Cache-Lookup: HIT from cp1019.eqiad.wmnet:3128
X-Cache: MISS from cp1008.eqiad.wmnet
X-Cache-Lookup: MISS from cp1008.eqiad.wmnet:80
Connection: close

Auto-refreshes/auto-redirects like this are generally considered terrible from an accessibility standpoint. This header should simply be removed.
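For readers unfamiliar with the header, here is a minimal sketch of how a value such as `5; url=http://en.wikipedia.org/wiki/wfhsdklfjsdklfj` breaks down into a delay and a target. The function name `parse_refresh` is hypothetical, and the parsing only covers the `<seconds>; url=<target>` shape seen in the output above; real browsers are more lenient.

```python
from typing import Optional, Tuple


def parse_refresh(value: str) -> Tuple[float, Optional[str]]:
    """Split a Refresh header value into (delay_seconds, target_url).

    Handles only the common "<seconds>; url=<target>" shape.
    """
    delay_part, _, rest = value.partition(";")
    delay = float(delay_part.strip())
    rest = rest.strip()
    if not rest:
        return delay, None  # no target: reload the current page
    key, sep, target = rest.partition("=")
    if sep and key.strip().lower() == "url":
        return delay, target.strip().strip("\"'")
    return delay, rest  # no "url=" prefix: treat the rest as the URL


print(parse_refresh("5; url=http://en.wikipedia.org/wiki/wfhsdklfjsdklfj"))
```

In other words, the browser shows the 404 page for five seconds and then navigates to the guessed /wiki/ URL without any user action.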

This is kind of related to bug 17316, but not really.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=54357

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 12:18 AM
bzimport set Reference to bz35052.
bzimport added a subscriber: Unknown Object (MLST).

Bug 17316 says

  • It has a meta refresh on the client-side, which is pretty much universally discouraged for accessibility purposes;

You say

Auto-refreshes/auto-redirects like this are generally considered terrible from
an accessibility standpoint. This header should simply be removed.

Neither of you pointed to the applicable guidelines.

(In reply to comment #1)

Neither of you pointed to the applicable guidelines.

What's your point?

(In reply to comment #2)

(In reply to comment #1)

Neither of you pointed to the applicable guidelines.

What's your point?

My point is that those guidelines are needed to resolve this problem. At the very least, we should have the guidelines so anyone can verify the above assertions.

(In reply to comment #3)

My point is that those guidelines are needed to resolve this problem. At the
very least, we should have the guidelines so anyone can verify the above
assertions.

I don't really think it's necessary to re-debate what was properly decided (and deprecated) over ten years ago, but if someone would like to on this bug, I'm prepared.

That guideline says not to use meta refresh in place of HTTP redirect.

ie: A redirect page with nothing on it but a meta redirect and text saying the user will be redirected.

This isn't a server redirect replacement. This is a 404 page that is intended as a 404 page and includes a meta redirect to make a guess of where a user 'might' want to go.

Personally, I want to eventually handle 404s internally within MediaWiki.

(In reply to comment #5)

That guideline says not to use meta refresh in place of HTTP redirect.

ie: A redirect page with nothing on it but a meta redirect and text saying the
user will be redirected.

This isn't a server redirect replacement. This is a 404 page that is intended
as a 404 page and includes a meta redirect to make a guess of where a user
'might' want to go.

Except: what percentage of users aren't able to read the error page before being auto-redirected?

The goal of the guideline was to clarify that redirects should be automatic or not, as browser control over meta-refresh tags (delayed redirects) wasn't (and still isn't) feasible. Even browsers with the ability to disable meta-refresh tags use such an obscure system that it's limited to power users. And it's usually exactly the opposite kind of user who needs the page displayed for a longer period of time.

Auto-redirects are bad for accessibility. Providing a link to the possible intended target (the "guess," as you call it) is completely sufficient here, isn't it?
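A rough sketch of the "link only" alternative, to make the proposal concrete. This is not the code in production; `build_404` and its HTML are hypothetical, and the guess simply prefixes the requested path with /wiki/, mirroring the redirect target shown in the original report.

```python
from html import escape
from urllib.parse import quote


def build_404(host: str, path: str):
    """Return (status, headers, body) for a 404 that links to a guess.

    Hypothetical helper: no Refresh header and no meta refresh, so the
    reader decides whether and when to follow the suggested link.
    """
    guess = "http://" + host + "/wiki/" + quote(path.lstrip("/"))
    body = (
        "<h1>Not Found</h1>"
        "<p>Did you mean <a href=\"" + escape(guess, quote=True) + "\">"
        + escape(guess) + "</a>?</p>"
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}  # no Refresh
    return 404, headers, body


status, headers, body = build_404("en.wikipedia.org", "/wfhsdklfjsdklfj")
print(status, headers)
print(body)
```

The page stays on screen for as long as the reader needs, and the guessed target is still one click away.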

Interestingly, if the URL path begins with "w/" (cf. bug 54357), no "Refresh" header is output. Compare:


mzmcbride@tools-login:~$ curl -I "http://en.wikipedia.org/o"
HTTP/1.0 404 Not Found
Date: Tue, 24 Sep 2013 21:30:29 GMT
Server: Apache
Cache-Control: s-maxage=2678400, max-age=2678400
X-Wikimedia-Debug: prot=http:// serv=en.wikipedia.org loc=/o
Refresh: 5; url=http://en.wikipedia.org/wiki/o
Content-Length: 2786
Content-Type: text/html; charset=utf-8
X-Cache: MISS from cp1007.eqiad.wmnet
X-Cache-Lookup: MISS from cp1007.eqiad.wmnet:3128
X-Cache: MISS from cp1017.eqiad.wmnet
X-Cache-Lookup: MISS from cp1017.eqiad.wmnet:80
Connection: close

Here we see the "Refresh" header, confirming comment 0.

Compare to:


mzmcbride@tools-login:~$ curl -I "http://en.wikipedia.org/w/o"
HTTP/1.0 404 Not Found
Date: Tue, 24 Sep 2013 21:25:25 GMT
Server: Apache
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
X-Wikimedia-Debug: prot=https:// serv=en.wikipedia.org loc=/w/o
Content-Length: 2755
Content-Type: text/html; charset=utf-8
Age: 327
X-Cache: HIT from cp1020.eqiad.wmnet
X-Cache-Lookup: HIT from cp1020.eqiad.wmnet:3128
X-Cache: MISS from cp1005.eqiad.wmnet
X-Cache-Lookup: MISS from cp1005.eqiad.wmnet:80
Connection: close

No "Refresh" header.
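The same comparison can be scripted. The sketch below assumes the header dumps are available as plain strings (abbreviated here from the two outputs above); `header_values` is a hypothetical helper, not part of any existing tool.

```python
def header_values(raw_headers: str, name: str) -> list:
    """Return all values of header `name` in a raw `curl -I` style dump."""
    values = []
    for line in raw_headers.splitlines()[1:]:  # skip the status line
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == name.lower():
            values.append(value.strip())
    return values


# Abbreviated copies of the two responses shown above.
WITH_REFRESH = """HTTP/1.0 404 Not Found
Server: Apache
Refresh: 5; url=http://en.wikipedia.org/wiki/o
Content-Type: text/html; charset=utf-8
"""

WITHOUT_REFRESH = """HTTP/1.0 404 Not Found
Server: Apache
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
Content-Type: text/html; charset=utf-8
"""

print(header_values(WITH_REFRESH, "Refresh"))
print(header_values(WITHOUT_REFRESH, "Refresh"))
```

An empty list for the /w/o response corresponds to the missing "Refresh" header noted above.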

The code that does this is in operations/mediawiki-config.git, specifically w/404.php. You could submit a patch.

Change 233664 had a related patch set uploaded (by MZMcBride):
Remove auto-redirection from 404 page.

https://gerrit.wikimedia.org/r/233664

Adding Web-Team-Backlog and Discovery-ARCHIVED in case someone from either of those two teams wants to comment.

The guidelines from the W3C are emphatic on this point, so I don't think extensive discussion is necessary. I do think that the Design and #usability teams ought to have a look at the 404 page and make sure that the text and presentation of the 404 page are maximally helpful.

Change 233664 merged by jenkins-bot:
Remove auto-redirection from 404 page.

https://gerrit.wikimedia.org/r/233664

Since Discovery was requested to comment here, I'll state that I have no particularly strong opinion on this task one way or another. To me, the change seems sensible enough given that the user will barely have enough time to read the 404 page before being automatically redirected.

Nemo_bis assigned this task to MZMcBride.
Nemo_bis set Security to None.
In T37052#1576416, @ori wrote:

The guidelines from the W3C are emphatic on this point, so I don't think extensive discussion is necessary. I do think that the Design and #usability teams ought to have a look at the 404 page and make sure that the text and presentation of the 404 page are maximally helpful.

Filed as T110376.

The goal of the delayed redirect was to be annoying to the reader (so he doesn't "learn" not to type "/wiki/") while being good enough to do what most people wanted. It's suboptimal that your granny didn't have enough time to read (and understand!) the error message. But now I worry that she will be left waiting on that page and do nothing.

Perhaps we can get some stats of root 404s not followed by another page view? (not easy to gather, I know…)
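As a rough illustration of what such a statistic could look like, the sketch below computes the share of 404 hits that are the last request in their session. Everything about the input is an assumption: the `(session_id, timestamp, status)` record shape is hypothetical, no such session data is described in this task, and the sketch does not filter for root-level paths.

```python
from collections import defaultdict


def abandoned_404_share(events):
    """Fraction of 404 hits that are the last request in their session.

    `events` is an iterable of (session_id, timestamp, status) tuples;
    this record shape is an assumption made for the sketch.
    """
    by_session = defaultdict(list)
    for session_id, timestamp, status in events:
        by_session[session_id].append((timestamp, status))

    total = abandoned = 0
    for hits in by_session.values():
        hits.sort()  # order each session's requests by timestamp
        for index, (_, status) in enumerate(hits):
            if status == 404:
                total += 1
                if index == len(hits) - 1:  # nothing followed the 404
                    abandoned += 1
    return abandoned / total if total else 0.0


sample = [
    ("a", 1, 404), ("a", 2, 200),  # 404 followed by a page view
    ("b", 1, 404),                 # 404 with nothing after it
]
print(abandoned_404_share(sample))
```

Comparing this share before and after the redirect was removed would indicate whether readers are getting stuck on the error page.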