
Check If Html Tag Is Self-closing - HTMLparser - Python

Is there a way to check whether a tag is self-closing with HTMLParser? I know self-closing tags are handled by the built-in method handle_startendtag(). However, it only handles

Solution 1:

Not exactly a Python-specific solution, but if you want to know which tags have this "self-closing property", you can look at the official HTML5 specs: these are formally known as void elements.

area, base, br, col, embed, hr, img, input, keygen, link, menuitem,
meta, param, source, track, wbr

Strictly speaking, void elements do not have closing tags at all, but permit an extra / immediately before the >.
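
Applied to the standard library's HTMLParser, the idea above can be sketched roughly as follows. This is only an illustration, not the asker's code: the VOID_ELEMENTS set is copied from the list above (the current HTML standard may list a slightly different set), and the class name VoidTagFinder and its found attribute are made up for the example. HTMLParser calls handle_starttag() for `<br>` and handle_startendtag() for `<br/>`, so both are checked against the set.

```python
from html.parser import HTMLParser

# Copied from the list above; an assumption, adjust to the spec version you target.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "menuitem", "meta", "param", "source", "track", "wbr",
}


class VoidTagFinder(HTMLParser):
    """Hypothetical helper: records void tags written as <br> or <br/>."""

    def __init__(self):
        super().__init__()
        self.found = []  # (tag, written_with_slash) pairs

    def handle_starttag(self, tag, attrs):
        # Called for tags without a trailing slash, e.g. <br> or <p>.
        if tag in VOID_ELEMENTS:
            self.found.append((tag, False))

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style tags such as <br/>; <div/> is skipped
        # because div is not a void element.
        if tag in VOID_ELEMENTS:
            self.found.append((tag, True))


parser = VoidTagFinder()
parser.feed('<p>one<br>two<br/><img src="x.jpg"></p>')
parser.close()
print(parser.found)  # [('br', False), ('br', True), ('img', False)]
```

Overriding handle_startendtag() matters here: the default implementation forwards to handle_starttag() and handle_endtag(), which would lose the information about how the tag was written.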


Solution 2:

A simple solution is to use BeautifulSoup.

In [76]: from bs4 import BeautifulSoup

In [77]: BeautifulSoup('<img src="x.jpg">')
Out[77]: <img src="x.jpg"/>

You can also check whether a tag is self-closing.

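from bs4 import BeautifulSoup
from bs4.element import Tag

# html is the markup string you want to inspect
soup = BeautifulSoup(html, 'html.parser')
# soup.descendants walks the whole tree, not just the top-level nodes
tags = [tag for tag in soup.descendants if isinstance(tag, Tag)]
# is_empty_element is the bs4 name; isSelfClosing is the older BS3 spelling
self_closing = [tag for tag in tags if tag.is_empty_element]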

Every Tag has an is_empty_element property (isSelfClosing is the older Beautiful Soup 3 name for the same thing), so you can filter on it.

