In this Python tutorial, we will learn how to parse XML documents using the ElementTree library. The examples cover scenarios such as accessing tag names, reading attributes, and iterating over child nodes.
Python XML Parsing using ElementTree
ElementTree ships with Python as the xml.etree.ElementTree module of the standard library, so there is nothing extra to install.
We shall look into examples that parse an XML file, extract attributes, and extract elements using this library.
We shall use the following XML file for the examples in this tutorial.
sample.xml
<?xml version="1.0" encoding="UTF-8" ?>
<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>
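If you would rather not create a file while experimenting, the same document can be parsed from a string. The following is a minimal sketch, assuming the XML is held in a variable we call SAMPLE_XML (a name chosen here for illustration); the XML declaration is left out for brevity. ET.fromstring() returns the root element directly, so no getroot() call is needed.

```python
import xml.etree.ElementTree as ET

# Same content as sample.xml, held in a string (SAMPLE_XML is an illustrative name)
SAMPLE_XML = """<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>"""

# fromstring() parses the text and returns the root element
root = ET.fromstring(SAMPLE_XML)
print(root.tag)   # holidays
print(len(root))  # 2 (number of direct children)
```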
1. Get Root Tag Name
In the following program, we get the tag name of the root node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
tag = root.tag
print(tag)
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
holidays
2. Get Attributes of Root
In the following program, we access the attributes of the root node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# get all attributes
attributes = root.attrib
print(attributes)
# extract a particular attribute
year = attributes.get('year')
print('year :', year)
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
{'year': '2017'}
year : 2017
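An attribute can also be read straight from the element with its get() method, which accepts a default for attributes that are absent. The sketch below uses a cut-down root element; the 'country' attribute is hypothetical and does not exist in sample.xml, which is exactly why the default is returned.

```python
import xml.etree.ElementTree as ET

# A cut-down root element carrying only the 'year' attribute
root = ET.fromstring('<holidays year="2017"></holidays>')

# Element.get() reads one attribute without going through .attrib
year = root.get('year')
print(year)     # 2017

# 'country' is a hypothetical attribute that is not present,
# so the supplied default is returned instead of None
country = root.get('country', 'unknown')
print(country)  # unknown
```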
3. Iterate over child nodes of root
In the following program, we iterate over the child nodes of the root node using a for loop.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over all the nodes with tag name - holiday
for holiday in root.findall('holiday'):
    print(holiday)
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
<Element 'holiday' at 0x7fb5a107d3b8>
<Element 'holiday' at 0x7fb59fc2f868>
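Note that findall() with a plain tag name matches only direct children of the element it is called on. To reach deeper descendants, the iter() method can be used instead. The sketch below parses the sample content from a string (the variable name SAMPLE_XML is our own) and contrasts the two calls.

```python
import xml.etree.ElementTree as ET

# Sample content inlined as a string so the snippet runs on its own
SAMPLE_XML = """<holidays year="2017">
<holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
<holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(SAMPLE_XML)

# 'name' elements are grandchildren of root, so findall() finds none
direct_matches = root.findall('name')
print(direct_matches)  # []

# iter() walks the whole subtree in document order
names = [element.text for element in root.iter('name')]
print(names)  # ['New Year', 'Gandhi Jayanti']
```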
4. Iterate over child nodes of root and get their attributes
The following program extends the previous one: while iterating over the child nodes, we also access the attributes of each child.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over child nodes
for holiday in root.findall('holiday'):
    # get all attributes of a node
    attributes = holiday.attrib
    print(attributes)
    # get a particular attribute
    # (named holiday_type to avoid shadowing the built-in type())
    holiday_type = attributes.get('type')
    print(holiday_type)
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
{'type': 'other'}
other
{'type': 'public'}
public
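When only the children with a particular attribute value are of interest, the filter can be pushed into the path expression itself, since ElementTree supports a limited subset of XPath. The following sketch selects only the holidays whose type is 'public'; the inlined SAMPLE_XML string is our own stand-in for sample.xml.

```python
import xml.etree.ElementTree as ET

# Sample content inlined as a string so the snippet runs on its own
SAMPLE_XML = """<holidays year="2017">
<holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
<holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(SAMPLE_XML)

# [@type='public'] keeps only elements whose 'type' attribute equals 'public'
public_holidays = root.findall("holiday[@type='public']")

for holiday in public_holidays:
    print(holiday.find('name').text)  # Gandhi Jayanti
```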
5. Access Elements of a Node
In the following program, we access the child elements of each holiday node by their tag names and read their text.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over all nodes
for holiday in root.findall('holiday'):
    # access element - name
    name = holiday.find('name').text
    print('name :', name)
    # access element - date
    date = holiday.find('date').text
    print('date :', date)
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
name : New Year
date : Jan 1
name : Gandhi Jayanti
date : Oct 2
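One caveat with find(...).text: find() returns None when no matching child exists, so reading .text then raises an AttributeError. The findtext() method avoids this by returning a default instead. In the sketch below, the second holiday deliberately has no date element; this modified document is an assumption made for illustration and differs from sample.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the sample: the second holiday has no <date>
SAMPLE_XML = """<holidays year="2017">
<holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
<holiday type="public"><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(SAMPLE_XML)

rows = []
for holiday in root.findall('holiday'):
    # findtext() returns the default when the child element is missing
    name = holiday.findtext('name', default='unknown')
    date = holiday.findtext('date', default='unknown')
    rows.append((name, date))
    print('name :', name, '| date :', date)
```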
6. Access Elements of a Node without knowing their tag names
In the following program, we iterate over the child elements of each node in a for loop and read each child's tag and text, without naming the tags in advance.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
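for holiday in root.findall('holiday'):
    # access all elements in node
    for element in holiday:
        ele_name = element.tag
        # read the text from the element itself; find(element.tag) would
        # return the first child with that tag, not necessarily this one
        ele_value = element.text
        print(ele_name, ':', ele_value)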
Output
tutorialkart@arjun-VPCEH26EN:~/PycharmProjects/PythonTutorial/parsing$ python python_xml_parse_ElementTree.py
date : Jan 1
name : New Year
date : Oct 2
name : Gandhi Jayanti
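The same tag-agnostic iteration can be used to collect each node into a dictionary, which is often handier than printing. This is a minimal sketch under our own naming (SAMPLE_XML, records); it assumes no child element is itself called 'type', since that key is used for the attribute.

```python
import xml.etree.ElementTree as ET

# Sample content inlined as a string so the snippet runs on its own
SAMPLE_XML = """<holidays year="2017">
<holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
<holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(SAMPLE_XML)

records = []
for holiday in root.findall('holiday'):
    # map every child's tag to its text, whatever the tags happen to be
    record = {child.tag: child.text for child in holiday}
    # keep the node's own 'type' attribute alongside the child values
    record['type'] = holiday.get('type')
    records.append(record)

print(records)
```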
Conclusion
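In this Python tutorial, we learned how to parse an XML file using the ElementTree library: reading the root tag and its attributes, iterating over child nodes, and accessing child elements with and without knowing their tag names.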