What are the common techniques to handle user-generated HTML modified differently by different browsers?
Posted by Jakie on Programmers
Published on 2011-10-07T01:41:00Z
Indexed on 2011/11/13 2:07 UTC
I am developing a website updater. The front end uses HTML, CSS and JavaScript, and the backend uses Python.
The way it works is that <p/>, <b/> and some other HTML elements can be updated by the user. To enable this, I load the webpage and, with jQuery, convert all those elements to <textarea/> elements. Once the content of a text area is changed, I apply the change to the original element and send the resulting HTML to a Python script to store the new content.
The problem is that I'm finding that different browsers change the original HTML.
- How do you get around this issue?
- What Python libraries do you use?
- What techniques or application designs do you use to avoid or overcome this issue?
The problems I found are:
- IE removes the quotes around class and id attributes. For example, <img class='abc'/> becomes <img class=abc/>.
- Firefox removes the trailing slash from line breaks: <br /> becomes <br>.
- Some websites have very specific display technicalities, so the insertion of a simple "\n" (which IE does) can affect the display of a website. Example: changing <img class='headingpic' /><div id="maincontent"> to <img class='headingpic'/>\n <div id="maincontent"> inserts a vertical gap in IE.
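To tell whether a browser changed only the serialization (quotes, self-closing slashes, whitespace between tags) and not the content, the backend could compare the two versions as token streams instead of as strings. The following is a minimal sketch using Python 3's standard html.parser module. The names VOID_TAGS and canonical_tokens are hypothetical, and the sketch deliberately treats whitespace as insignificant, which is only suitable for detecting changes, not for deciding what to store.

```python
from html.parser import HTMLParser

# Elements without a closing tag, so "<br>" and "<br />" count as equal.
# Assumption: this short list covers the void elements used on the site.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class _TokenCollector(HTMLParser):
    """Collect a token stream that ignores quoting and whitespace."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # The parser already strips attribute quotes; sort by attribute
        # name so that attribute order does not matter either.
        ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
        self.tokens.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        # "<br />" reports an end tag, "<br>" does not; drop it for voids.
        if tag not in VOID_TAGS:
            self.tokens.append(("end", tag))

    def handle_data(self, data):
        # Collapse runs of whitespace and skip whitespace-only text.
        text = " ".join(data.split())
        if text:
            self.tokens.append(("text", text))


def canonical_tokens(html):
    """Return the normalized token list for an HTML string."""
    parser = _TokenCollector()
    parser.feed(html)
    parser.close()
    return parser.tokens


original = "<img class='headingpic' /><div id=\"maincontent\">Hello<br /></div>"
from_browser = "<img class=headingpic>\n <div id=maincontent>Hello<br></div>"
print(canonical_tokens(original) == canonical_tokens(from_browser))
```

If the token lists are equal, the browser changed nothing but formatting and the stored original can be kept byte for byte.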
The things I have unsuccessfully tried to overcome these issues:
- Using either jQuery or Python to remove all >\n< occurrences, <br>, etc. But this fails because I get different patterns in IE, sometimes a "·\n", sometimes a "\n···".
- In Python, parsing the new HTML, extracting the new text/content, and inserting it into the old HTML so the elements and format never change, just the content. This is very difficult and seems to be overkill.
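The second attempt (keep the old markup, take only the new text) can be sketched with the standard library alone. The sketch below is a rough illustration, not a complete solution: it assumes Python 3's html.parser, assumes the editable elements are <p> and <b>, and gives up when the old and new HTML do not contain the same number of text runs (for example when the user adds a line break or an entity). The names EDITABLE_TAGS and merge_text are hypothetical.

```python
from html.parser import HTMLParser

EDITABLE_TAGS = {"p", "b"}  # assumption: the elements the user may edit


class _SpanFinder(HTMLParser):
    """Record (start, end, text) for text runs inside editable elements."""

    def __init__(self, source):
        # convert_charrefs=False keeps each text chunk identical to the
        # source, so its length can be used to compute the end offset.
        super().__init__(convert_charrefs=False)
        self.line_starts = [0] + [i + 1 for i, ch in enumerate(source) if ch == "\n"]
        self.depth = 0
        self.spans = []

    def _offset(self):
        # getpos() returns (line, column); convert it to a string index.
        line, col = self.getpos()
        return self.line_starts[line - 1] + col

    def handle_starttag(self, tag, attrs):
        if tag in EDITABLE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in EDITABLE_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Ignore text outside editable elements and whitespace-only runs.
        if self.depth and data.strip():
            start = self._offset()
            self.spans.append((start, start + len(data), data))


def _find_spans(html):
    finder = _SpanFinder(html)
    finder.feed(html)
    finder.close()
    return finder.spans


def merge_text(original_html, edited_html):
    """Copy edited text into original_html, leaving its markup untouched."""
    old = _find_spans(original_html)
    new = _find_spans(edited_html)
    if len(old) != len(new):
        raise ValueError("editable text runs do not line up")
    parts, cursor = [], 0
    for (start, end, _), (_, _, text) in zip(old, new):
        parts.append(original_html[cursor:start])  # untouched markup
        parts.append(text)                         # text from the browser
        cursor = end
    parts.append(original_html[cursor:])
    return "".join(parts)


stored = "<img class='headingpic' /><div id=\"maincontent\"><p class='x'>Hello<br />world</p></div>"
posted = "<img class=headingpic>\n <div id=maincontent><p class=x>Hi<br>there</p></div>"
print(merge_text(stored, posted))
```

Because only the text runs are replaced, the quoting, the <br /> form and the absence of the extra "\n" in the stored HTML all survive, whichever browser produced the posted version.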