Opened 6 years ago

Closed 5 years ago

#1750 closed bug (fixed)

html escaping and filtering user generated text before display

Reported by: globetrotter_tt Owned by: lantti
Priority: critical Milestone: 1.8
Component: BW General Keywords: escaping
Cc: planetcruiser, jsfan, shevek, mikael

Description

lantti has already done some work on it. See merge request: https://gitorious.org/bewelcome/rox/merge_requests/5

Change History (56)

comment:1 Changed 6 years ago by lantti

  • Owner set to lantti
  • Status changed from new to accepted

comment:2 Changed 6 years ago by lantti

The changes made so far:

Changes that affect all or almost all pages:

  • Page title text is now html escaped before inserting it to the response. At least Groups and Forums use user supplied text as page title
  • Various login buttons login links content is html escaped. If not logged in the login buttons are shown and there are parts of the original request url appended to their links
  • The translation links in the footer are html escaped. Like login buttons, they also copy parts of the request url.

In Groups and sub-pages:

  • The name or description of a group is html escaped before inserting to response on various pages that use this information. Noticed just now that the description should maybe not be completely html escaped, but run through HTMLPurifier instead. Will fix that next.

In Blogs:

  • The category names are html escaped before inserting to response.
  • The teaser text is html escaped before inserting to response (contains parts of the request url).

In Profile:

  • The names of the groups the user belongs to are html escaped before inserting to response.

In profiles all comments page:

  • All comment text is run through HTMLPurifier before inserting to response.

In Forums:

  • Forum name in the navigation path text is html escaped before inserting to response. Also affects Groups where the info about group forum is used.

In Login widget:

  • Login links content is html escaped like with the login buttons elsewhere.

In front page:

  • Language selection menu items link content is html escaped before insertion to response.

In Quicksearch:

  • Original search string is html escaped before insertion to the response.

In Trips:

  • Original search string is html escaped before insertion to the response when using the searchbox in the teaser.

In Shouts:

  • Actually I have no idea what this is. But there seemed to be a login button with exactly the same code as all other login buttons elsewhere, so I though to make the identical fix here too.

In Invite Friend:

  • All content in the pre-filled form text fields is html escaped because in some cases it gets copied from previous user input.

comment:3 Changed 6 years ago by lantti

Changed the call to htmlspecialchars to a call to HTMLPurifier instead when inserting Group Description to the Group page

comment:4 Changed 6 years ago by jsfan

  • Milestone changed from unassigned to 1.0-sec

comment:5 Changed 6 years ago by lantti

Found and fixed some cases I had missed before in groups, trips and blogs.

comment:6 Changed 6 years ago by jsfan

Currently, the filter is applied to the group name in a profile in a manner that filters out all links for translators as well.

comment:7 Changed 6 years ago by planetcruiser

@lantti: hehe, i bet you didn't expect translation mark-up in the return values of i18n calls, did you? welcome to rox (again)! :-p i needed to switch off the computer and go for a walk when i first saw this. ;)

yesterday i found php code in the database table for preferences fields (echo '<option>blabla</option>' etc.).. you can imagine my face

well, happy hacking!

may the rewrite be neigh

comment:8 follow-up: Changed 6 years ago by lantti

Fixed breaking translation links by disabling translation of group names in mygroups. The group names are not translated on the actual groups pages anyway, plus translating them means using user supplied text as keys to the words database. As far as I see only bw translators would have been able to translate these names anyway so for average groups user it makes almost no difference.

comment:9 in reply to: ↑ 8 Changed 6 years ago by globetrotter_tt

Replying to lantti:

Fixed breaking translation links by disabling translation of group names in mygroups. The group names are not translated on the actual groups pages anyway, plus translating them means using user supplied text as keys to the words database. As far as I see only bw translators would have been able to translate these names anyway so for average groups user it makes almost no difference.

I am not sure if translators should be able to translate the content that created by other users. For group names and description i don't see a good reason for translating them. Usually there is only one language spoken inside a certain group.

comment:10 Changed 6 years ago by planetcruiser

do we have translated group names at all? could someone check? if not, i am fine with removing this feature, too. if we do have sensible translations, we need to think of a way to migrate them first.

comment:11 Changed 6 years ago by planetcruiser

this also needs a lot of testing of all the effected features. main questions:

  • is html still allowed everywhere where it used to be allowed?
  • does old text with html markup from the database entered by users still appear the same way on the page?

comment:12 Changed 6 years ago by jsfan

I believe there are some translated group names. However, it seems to me that they are pretty much never displayed anyway. If you search for "Journalists and Writers" and switch to Spanish, you will still see "Journalists and Writers". However, if you go to one of the members' profiles and look at the listing there, you see the translation.

I reckon given the inconsistency rendering the translations close to useless already, I'd just drop them. I don't think there a whole lot of them, anyway.

In a broader sense, I'd suggest that with the preliminary testing that has happened now, we should take our chances. This is a big security fix and it is likely that it will still break something somewhere even if we test really well.

comment:13 Changed 6 years ago by lantti

To planetcruiser:

  • is html still allowed everywhere where it used to be allowed?

Definitely not. For example group names, search strings, blog category names etc. would have to do without html markup from now on. I mean it used to be allowed everywhere as far as I see. If there is a decision/documentation somewhere where it says where it was supposed to be allowed, I'll be happy to go through it and check it stays allowed in those places.

  • does old text with html markup from the database entered by users still appear the same way on the page?

I made changes only where I found it absolutely needed. The html entered into the database renders differently depending through which page it is viewed. For example the same profile comments behave differently if viewed on the profile page or comments page because profile page uses strip_tags while comments page used to display them raw (now using htmlpurifier). All in all the answer is the same as above. Definitely not.

comment:14 Changed 6 years ago by lantti

About the translation links, I can also make a gentler fix using htmlpurifier. I just thought it generally a bad practise to use user text as database keys. With the system as it is now you could accidently name your group with an existing translated word from somewhere else and get royally confused about it.

comment:15 Changed 6 years ago by crumbking

I suggest to use purifier everywhere.

See: http://gitorious.org/bewelcome/rox/blobs/master/modules/htmlpurify/lib/htmlpurify.lib.php#line63

I would like to have basic html and linkify working on the profile.

Also in group description it would be great.

All other forms should run through: getPurifier().

We could add:

    public function getHtmlLinkifyPurifier()
    {
        require_once(SCRIPT_BASE . 'lib/htmlpurifier/library/HTMLPurifier.standalone.php');
        $config = HTMLPurifier_Config::createDefault();
        $config->set('HTML.Allowed', 'p,b,a[href],br,i,strong,em,ol,ul,li,dl,dt,dd');
        $config->set('AutoFormat.AutoParagraph', true); // automatically turn double newlines into paragraphs
        $config->set('AutoFormat.Linkify', true); // automatically turn stuff like http://domain.com into links
        return new HTMLPurifier($config);
    }

comment:16 Changed 6 years ago by lantti

I think using htmlpurifier everywhere is a bit excessive. I see no reason for allowing html in page titles and blog category names for example. Also allowing html everywhere where it would be sensible to allow it is a bit outside the security scope already. This fix was supposed to just prevent people from forcing javascript etc. to each others web browsers. Maybe open another ticket for allowing html in profile and wherever it is missing now?

comment:17 Changed 6 years ago by jsfan

+1 for lantii

I also think that there is no need to overdo it on HTML. I'd lock it down for now and then possibly open it up a bit again as necessary. I'd like to get this over with, so we don't expose BW's members to XSS any longer. I consider it a disgrace that we do and don't really mind breaking insignificant features to achieve better security.

Last edited 6 years ago by jsfan (previous) (diff)

comment:18 Changed 6 years ago by jsfan

  • Groups: I reckon there is few translations. I'd just do it.
  • Blogs: clicking through the listings, I can't see anything breaking on alpha atm.
  • Profile: (group listings) No more issues here. (comments) Haven't encountered any problems so far.
  • Login: Not sure I have seen this in action but no issues to report.
  • Front page: No issues.
  • Quick Search: No alternative from a security perspective, I'd say.
  • Trips: Can't see the input displayed anywhere. All god from my POV.
  • Shouts: Wouldn't know that they are used.
  • Invites: This one might be problematic. I see escaped HTML in the actual email which means thje link in the HTML isn't clickable any more. By default there is a link to the member's profile embedded. Where is this displayed on the site again? Could we purify this instead?

comment:19 Changed 6 years ago by lantti

  • Invites: This one might be problematic. I see escaped HTML in the actual email which means thje link in the HTML isn't clickable any more. By default there is a link to the member's profile embedded. Where is this displayed on the site again? Could we purify this instead?

The problem is that the link there is already html escaped so now it gets escaped twice. We shouldn't use htmlpurify there because it is not our intention to allow html in displaying the textbox itself, but just in the message. Actually if you write a link in that box and send, the link gets sent correctly. I'll try to find where the link in the pre-filled text gets escaped...

comment:20 Changed 6 years ago by lantti

Found and fixed the breaking link in the invite message

comment:21 follow-up: Changed 6 years ago by crumbking

comment:22 Changed 6 years ago by midsch

I played around a little bit on alpha with inserting html into fields:

<i>invers</i> linebreak:<br/>
<b>bold</b>
<h1>headline</h1>
<ul><li>list</li></ul>
<a href="http://bewelco.me">netz</a>
http://bewelcome.org
<div style="color:red">with style</div>
  • mapsearch: html mostly ignored, but some weird search results (I'd like to check this after OSM-switch)
  • profile
    • profile summary: html can be stored and is display again in the field for editing, but if you look at the profil page only bold/inverse work. Headlines and div don't do anything, ul/li does something but don't look correct ...
    • Occupation: all html went away while saving
    • Contact information: all html in all fields went away while saving
    • Accomodation: all html in all fields went away while saving
    • My interest: all html in all fields went away while saving
    • Family and close Friends: all html in all fields went away while saving, unlike the other textfields http://$somelink isn't turned into a link
  • Wiki: The string looks shit after saving, but only invers and bold work like html

I should have added image-tags and unclosed tags, but I'm to tired now ...

Last edited 6 years ago by midsch (previous) (diff)

comment:23 Changed 6 years ago by midsch

Oh, and a html-tag within passwords doesn't matter, it works as password. (It's hopefully not stored in plain text anyway.)

comment:24 Changed 6 years ago by jsfan

Added escaping of the username when output again in error message after failed input. Thanks, midsch.

https://gitorious.org/bewelcome/rox/commit/129c03d2dbd832ef21ce3dc9cec647d6b96b1658/diffs/44350aca7183e90cd96ffd412d7c997f1fa8da8a

comment:25 Changed 6 years ago by jsfan

I'm happy to close this one. Anyone seconding?

comment:26 Changed 6 years ago by planetcruiser

i will go through everything as well later today. i will close if everything is ok and no one else objects

comment:27 in reply to: ↑ 21 Changed 6 years ago by crumbking

Replying to crumbking:

http://gitorious.org/bewelcome/rox/commit/833d984242eba43c591c67c90e67742f251d1497

Added HTML and Linkify to the profile forms. (purify)

Arrg... as midch found out: only the profile summary works. No effects in the other fields.

This isn't urgent and the wrong ticket for it. (right one #1582). In case we wanna move on, here ;-)

comment:28 Changed 6 years ago by jsfan

Yes, this ticket relates to mitigating XSS problems in output, so no further patches please to fix where what kind of HTML is allowed.

If there is no escaped HTML displayed anywhere, we can close this ticket. That's all this ticket is about.

comment:29 Changed 6 years ago by beatnickgr

  • Resolution set to fixed
  • Status changed from accepted to closed

I've tested the page and it looks like it's working.

comment:30 Changed 6 years ago by crumbking

  • Resolution fixed deleted
  • Status changed from closed to reopened

In Groups and sub-pages:

The name or description of a group is html escaped before inserting to response on various >pages that use this information. Noticed just now that the description should maybe not be >completely html escaped, but run through HTMLPurifier instead. Will fix that next.

There is somesting not really working here:

See: http://www.bewelcome.org/groups/211

I added a link: <a href="http://www.bewelcome.org/wiki/Halle">Halle Wiki</a>

This happens after saving the form a couple of times:

<a href=\\\\\\\"http://www.bewelcome.org\\\\/wiki/Halle\\\\\\\">Halle Wiki</a>

I reopen this issue but we might wanna move it to a seperate ticket.

comment:31 Changed 6 years ago by lantti

Oops. The description text gets escaped multiple times when you save the description multiple times. I'll start thinking about fixing this somehow.

comment:32 Changed 6 years ago by lantti

Btw. this could maybe go to a separate ticket because it is not html escaping issue but sql escaping one.

comment:33 Changed 6 years ago by crumbking

Okay let's create one.

comment:34 Changed 6 years ago by lantti

  • Resolution set to fixed
  • Status changed from reopened to closed

Done #1780

comment:35 Changed 6 years ago by lantti

  • Milestone changed from 1.1 to unassigned
  • Resolution fixed deleted
  • Status changed from closed to reopened

simison found some more open xss exploits so I'm reopening this and start working on closing them...

comment:36 Changed 6 years ago by mikael

Could these get pass the regular milestone cycle by any change?

Two (edit: three) of the XSS are really bad. With another one I could run any JS on anybody's browser having BW open. Another one is on just specific pages, but still.

Additionally bunch of more typical XSS's. Haven't been trying to inject SQL, but it kinda smells like there would be something like that, too.

Last edited 6 years ago by mikael (previous) (diff)

comment:37 Changed 6 years ago by jsfan

SQL injection is pretty hard because of Suhosin. Not that I would want to rely on that but I've never been successful when I tried SQL injection on a server which used Suhosin.

I don't think we need a separate security release for the XSS problems if we keep to our release cycle and release 1.6 in 2 weeks, anyway.

comment:38 Changed 6 years ago by jsfan

  • Cc shevek added

comment:39 Changed 6 years ago by shevek

  • Milestone changed from unassigned to 1.6

Added to 1.6 milestone.

comment:40 Changed 6 years ago by mikael

@jsfan: Only place where I could get some SQL stuff really showing up was here http://www.bewelcome.org/bw/viewcomments.php?cid=foobar

comment:41 Changed 6 years ago by lantti

I wrote some fixes. There are still more to fix, but the db based ones that Mikael found should be fixed now. Those were the blog tag suggetions when a tag contained html and group name shown on the group search page when a group name contained html. There are also some fixes against html included on the request urls for the following pages: /feedback, /messages, verifymembers/verifiersof, /forums, /searchmembers

https://www.gitorious.org/bewelcome/rox/merge_requests/10

comment:42 Changed 6 years ago by shevek

  • Status changed from reopened to local_testing

Merged into develop.

Please test locally.

comment:43 Changed 6 years ago by shevek

Did a code review. Can't see anything wrong.

Any suggestion how to test this?

comment:44 Changed 6 years ago by crumbking

Mmm I tried to add some tags in the blog app. Works.

We should simply test on alpha I would say. Other ideas?

Should we invite mikael? So he could test?

comment:45 Changed 6 years ago by shevek

  • Cc mikael added

@mikael: Could you test locally? Otherwise I'd push that to alpha to be tested.

comment:46 Changed 6 years ago by mikael

@shevek: sorry, I don't have rox locally installed and I'm now on the road for two weeks. I can test these at alpha, just ping me when.

comment:47 Changed 6 years ago by shevek

  • Status changed from local_testing to to_alpha

Okay, we deploy to alpha then.

comment:48 Changed 6 years ago by mikael

@shevek, did you already? If not, could you pretty please ping me from here or at IRC (simison) when it's there. I'm seeing my parents & friends in Finland until April so I'm not following mailinglist / computer stuff so much.

If somebody else wants to test, here's good cheat sheet: https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet

comment:49 Changed 6 years ago by shevek

  • Status changed from to_alpha to testing

comment:50 Changed 6 years ago by mikael

Hmmh, sorry guys, looks like I've been way too busy with friends, mom, work and sis' wedding this week/end, absolutelly no time to check out this one. Thanks for IRC ping shevek anyways. ;-)

comment:51 Changed 6 years ago by shevek

  • Resolution set to fixed
  • Status changed from testing to closed

I did some checks and no nasty behaviour on the site that I could see. Closing as the code review also says its fine.

@mikael: Would be great though if you could check as well.

comment:52 Changed 5 years ago by mikael

  • Milestone changed from 1.6 to unassigned
  • Resolution fixed deleted
  • Status changed from closed to reopened

Found some more XSS (also one persistent) and gave @lantti info about it. Should be fixed asap.

comment:53 Changed 5 years ago by lantti

  • Milestone changed from unassigned to 1.8
  • Status changed from reopened to local_testing

fixed the things mikael mentioned. Those were album and image names and descriptions on various pages under galleries. I was spraying htmlspecialchars and htmlpurifier on top a bit, but there are a lot of page and template files in that directory and it is hard to understand which are still in use and which are not, so I might have skipped some that are still in use and I just failed to understand where. So please test.

comment:54 Changed 5 years ago by shevek

  • Status changed from local_testing to to_alpha

Deployed to alpha, please test.

comment:55 Changed 5 years ago by shevek

  • Status changed from to_alpha to testing

comment:56 Changed 5 years ago by shevek

  • Resolution set to fixed
  • Status changed from testing to closed

Reviewed the code and checked the gallery. Still works. Closed as fixed.

Note: See TracTickets for help on using tickets.