I recently have been playing with parsing GPX files and spitting out the results into a special KML file. I initially wrote a parser using minidom, yet after running this the first time -- and my Core2Duo laptop reaching 100% utilization for 10 seconds -- I realized I needed to re-write it using something else.
I spent a little time reading the different parsers for XML and eventually read more about cElementTree. And it is included with Python2.5, sweet.
I quickly rewrote the code and did some tests. First, the two bits of code for parsing my GPX file:
minidom-speed.py
#!/usr/bin/python from xml.dom import minidom from genshi.template import TemplateLoader def collect_info(): dom = minidom.parse('airport.gpx') for node in dom.getElementsByTagName('trkpt'): lat = node.getAttribute('lat') lon = node.getAttribute('lon') speed = node.getElementsByTagName('speed')[0].firstChild.data speed = float(speed) * 10 coords = '%s,%s' % (lon, lat) coords_speed = '%s,%s' % (coords, speed) yield { 'coordinates': coords_speed } loader = TemplateLoader(['.']) template = loader.load('template-speed.kml') stream = template.generate(collection=collect_info()) f = open('minidom.kml', 'w') f.write(stream.render())
cet-speed.py
#!/usr/bin/python import sys,os import xml.etree.cElementTree as ET import string from genshi.template import TemplateLoader def collect_info(): mainNS=string.Template("{http://www.topografix.com/GPX/1/0}$tag") wptTag=mainNS.substitute(tag="trkpt") nameTag=mainNS.substitute(tag="speed") et=ET.parse(open("airport.gpx")) for wpt in et.findall("//"+wptTag): wptinfo=[] wptinfo.append(wpt.get("lon")) wptinfo.append(wpt.get("lat")) wptinfo.append(str(float(wpt.findtext(nameTag)) * 10)) coords_speed = ",".join(wptinfo) yield { 'coordinates': coords_speed, } loader = TemplateLoader(['.']) template = loader.load('template-speed.kml') stream = template.generate(collection=collect_info()) f = open('cet.kml', 'w') f.write(stream.render())
The speed difference is not just noticeable, but very noticeable.
minidom-speed.py
$ python -m cProfile minidom-speed.py 4405376 function calls (3787047 primitive calls) in 32.142 CPU seconds
cet-speed.py
$ python -m cProfile cet-speed.py 1082061 function calls (904167 primitive calls) in 6.736 CPU seconds
A quarter as many calls and almost 5x faster -- at least that's how I interpret the results. Much better!