Remove HTML tags from input text using Python
The following code snippet demonstrates how to remove HTML tags from an input text using Python.
Article Metadata
Tested with
Device(s): Nokia E50, Nokia 5800 XpressMusic
Compatibility
Platform(s): S60 1st Edition, S60 2nd Edition, S60 3rd Edition, S60 5th Edition
Article
Created: sajisoft (28 Aug 2009)
Last edited: hamishwillee (18 Sep 2012)
Source code
def remove_tags(input_text):
    # convert input_text to a mutable object (a list of characters)
    s_list = list(input_text)
    i = 0
    while i < len(s_list):
        # iterate until a left-angle bracket is found
        if s_list[i] == '<':
            # pop everything from the left-angle bracket up to the right-angle bracket;
            # the length check ends the loop if the tag is never closed
            while i < len(s_list) and s_list[i] != '>':
                s_list.pop(i)
            # pop the right-angle bracket, too (if one was found)
            if i < len(s_list):
                s_list.pop(i)
        else:
            i = i + 1
    # convert the list back into text
    join_char = ''
    return join_char.join(s_list)

# Now pass HTML-formatted text through this function. It removes the tags and returns the remaining string.
test_txt = "This is HTML<remove> text</remove>"
st = remove_tags(test_txt)
print(st)  # prints "This is HTML text"
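
The same idea can also be written as a single pass that collects the characters outside of tags, instead of popping characters from a list one at a time. The following is only a minimal sketch and is not part of the original snippet: the name remove_tags_single_pass is hypothetical, and it assumes that everything after an unclosed '<' should be dropped.

```python
def remove_tags_single_pass(input_text):
    # Hypothetical helper, shown only as an illustration.
    kept = []           # characters that lie outside of any tag
    inside_tag = False  # True while we are between '<' and '>'
    for ch in input_text:
        if ch == '<':
            # a tag starts: stop keeping characters
            inside_tag = True
        elif ch == '>' and inside_tag:
            # the tag ends: resume keeping characters after this one
            inside_tag = False
        elif not inside_tag:
            kept.append(ch)
    # convert the list back into text
    return ''.join(kept)

print(remove_tags_single_pass("This is HTML<remove> text</remove>"))  # prints "This is HTML text"
```

A '>' that appears outside of a tag is kept as ordinary text in this sketch.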


20 Sep 2009
This article shows how to remove HTML tags from input text in Python. The author explains the code snippet well.
The following example shows how an HTML tag is used.
<body> - This is where you will begin writing your document and placing your HTML codes.
</body> - Closes the HTML <body> tag.
If you want to remove tags of this type from your application or text, you can use this code directly.
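
As an illustration of stripping tags such as <body> from a piece of text, here is a short sketch based on a regular expression. It is not the article's method: the name strip_tags, the pattern, and the sample text are assumptions, and it relies on the re module being available in the Python build in use.

```python
import re

# Assumed pattern: '<', then any run of characters other than '>', then '>'.
TAG_PATTERN = re.compile(r'<[^>]*>')

def strip_tags(text):
    # Hypothetical helper: replace every match of the pattern with nothing.
    return TAG_PATTERN.sub('', text)

page = "<body>Hello from <b>PyS60</b></body>"
print(strip_tags(page))  # prints "Hello from PyS60"
```

Note that this pattern only matches complete tags, so a '<' with no closing '>' is left in the text unchanged.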
This article is useful for beginners as well as intermediate developers.