Page MenuHomePhabricator

Gerrit: Comment parser for "r123" should not interrupt in-word (e.g. urls)
Closed, ResolvedPublic

Description

The following commit message (rMWbe676ae4bb3c):

Commafy support for convertNumber
As per http://www.unicode.org/reports/tr35

Change-Id: I765298959ad10ea8561abfea487328c8ddeb39f0

Renders like:

Commafy support for convertNumber

As per r35">http://www.unicode.org/reports/tr35
Change-Id: I765298959ad10ea8561abfea487328c8ddeb39f0

HTML:

As per <a href="http://www.unicode.org/reports/t&lt;a href=">r35</a>"&gt;http://www.unicode.org/reports/t<a href="https://www.mediawiki.org/wiki/Special:CodeReview/MediaWiki/35">r35</a>


Version: wmf-deployment
Severity: normal

Details

Reference
bz45163

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 1:39 AM
bzimport added a project: Gerrit.
bzimport set Reference to bz45163.
bzimport added a subscriber: Unknown Object (MLST).

We had some attempts at fixing this (gerrit change 11589), but it wasn't perfect. There was some discussion upstream about making this less zealous (can't find the link offhand).

Someone just forgot a \b at the beginning and end of their regex.

Also, work like this was done for Bugzilla in Ib79728e9.

Re comment 3: For reasons beyond my limited regex skills, \b obviously didn't work as expected in Bugzilla, so I replaced it by (?:^|(?<=\s)) . Works now.

Yeah, I should have mentioned - IIRC \b matched between an equals sign and a letter, which meant still breaking URLs. (?:^|(?<=\s)) matches either the beginning of string or after any whitespace character, thus works.

Well, even \b would be an improvement for gerrit as right now it is also replacing things inside words (eg. ?x=foobar123)

If someone can point me to where these regexes live, I'll submit a patch to fix them.

Merged.

Marking this as fixed; it will probably still break if someone tries to, say, link to gitweb, but that's because gerrit's "regexes" make this pretty much impossible to implement properly like it was done for Bugzilla's parser.