The Apple iOS Health app lets you export all of your health data. The steps are straightforward:
After unzipping export.zip, you will find several types of files. For instance, workout routes are stored as .gpx files containing GPS tracking information. The data needed for plotting the sleep schedule is in the export.xml file.
You may look at the export.xml file by opening it in a text editor. The main structure of this XML file looks like this:
The root tag is <HealthData>, and there are different kinds of tags under it. In particular, we want the data in the <Record> tags that belong to sleep analysis, i.e., those with type="HKCategoryTypeIdentifierSleepAnalysis", for example:
<Record
  type="HKCategoryTypeIdentifierSleepAnalysis"
  sourceName="MyAppleWatch"
  sourceVersion="2021111"
  creationDate="2021-12-01 17:14:48 -0500"
  startDate="2021-12-01 08:38:00 -0500"
  endDate="2021-12-01 08:43:59 -0500"
  value="HKCategoryValueSleepAnalysisAsleep"
/>
To do so, first, we use the xml package to parse the XML file:
import xml.etree.ElementTree as ET

tree = ET.parse("apple_health_export/export.xml")
root = tree.getroot()
You can check the tag of root, which is "HealthData", as we saw in the XML file:
>>> root.tag
'HealthData'
To collect the attributes of every Record tag under root into a list, we can use a list comprehension:
records = [i.attrib for i in root.iter("Record")]
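A full export.xml can be large, so loading the whole tree at once may use a lot of memory. As a rough alternative sketch (not what the notebook does), ET.iterparse can stream the file and keep only the sleep records. The helper name iter_sleep_records and the tiny sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

SLEEP_TYPE = "HKCategoryTypeIdentifierSleepAnalysis"


def iter_sleep_records(source):
    """Yield the attribute dict of each sleep <Record>, one at a time.

    `source` may be a file name or a binary file object.
    (Hypothetical helper, not part of the original notebook.)
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Record":
            if elem.get("type") == SLEEP_TYPE:
                yield dict(elem.attrib)  # copy before clearing the element
            elem.clear()  # drop the element's attributes and children


# Tiny made-up sample standing in for export.xml
sample = io.BytesIO(b"""<HealthData>
  <Record type="HKCategoryTypeIdentifierSleepAnalysis"
          startDate="2021-12-01 08:38:00 -0500"
          endDate="2021-12-01 08:43:59 -0500"
          value="HKCategoryValueSleepAnalysisAsleep"/>
  <Record type="HKQuantityTypeIdentifierStepCount"
          startDate="2021-12-01 09:00:00 -0500"
          endDate="2021-12-01 09:05:00 -0500"
          value="120"/>
</HealthData>""")

sleep_records = list(iter_sleep_records(sample))
print(len(sleep_records))
```

With a real export, you would pass the path to export.xml instead of the in-memory sample and hand the resulting list to pd.DataFrame as before.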
Then, we can convert records into a pandas DataFrame and cast the string-type date/time columns to the datetime data type:
import pandas as pd

records_df = pd.DataFrame(records)
date_col = ['creationDate', 'startDate', 'endDate']
records_df[date_col] = records_df[date_col].apply(pd.to_datetime)
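Exports that span a daylight-saving change contain more than one UTC offset (for example -0400 and -0500), which some pandas versions handle awkwardly when the format is inferred. One possible workaround, sketched here with made-up timestamps, is to parse with an explicit format into UTC and then convert to a single local zone. The zone name "America/New_York" is only an assumption for this example.

```python
import pandas as pd

# Made-up timestamps in the export's format, with two different UTC offsets
raw = pd.Series([
    "2021-11-06 23:30:00 -0400",
    "2021-12-01 08:38:00 -0500",
])

# Parse with an explicit format and normalise everything to UTC
parsed_utc = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S %z", utc=True)

# Convert to one local time zone (assumed here; replace with your own)
parsed_local = parsed_utc.dt.tz_convert("America/New_York")
print(parsed_local)
```

Converting to a local zone keeps the wall-clock hours meaningful, which matters because the sleep plot is drawn against the hour of the day.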
Now, we can select only the Sleep Analysis data in our records:
sleeps_df = records_df.query("type == 'HKCategoryTypeIdentifierSleepAnalysis'")
Some sleep records start on one day and end on the next. To make the plotting process easier, we can cut those overnight records into two:
# Records that start and end on the same day of the month
no_cross = sleeps_df[sleeps_df["startDate"].dt.day == sleeps_df["endDate"].dt.day]
# Records that cross midnight
cross = sleeps_df[sleeps_df["startDate"].dt.day != sleeps_df["endDate"].dt.day]

# c1 keeps the part before midnight, c2 the part after midnight
c1 = cross.copy()
c2 = cross.copy()
c1["endDate"] = c1["startDate"].apply(lambda x: x.replace(hour=23, minute=59, second=59))
c2["startDate"] = c2["endDate"].apply(lambda x: x.replace(hour=0, minute=0, second=0))

sleeps_splitted_df = pd.concat([no_cross, c1, c2]).sort_values("startDate")
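The split above assumes each record crosses at most one midnight. As a rough generalisation (a sketch, not the notebook's code), the hypothetical helper split_at_midnight below loops until every piece lies within a single calendar day, using the same 23:59:59 convention for the end of a day. It assumes timezone-naive or fixed-offset timestamps, where adding one day is exactly 24 hours.

```python
import pandas as pd


def split_at_midnight(df, start="startDate", end="endDate"):
    """Return a copy of df in which no interval crosses midnight.

    Hypothetical helper; assumes tz-naive or fixed-offset timestamps.
    """
    one_day = pd.Timedelta(days=1)
    one_second = pd.Timedelta(seconds=1)
    pieces = []
    for _, row in df.iterrows():
        seg_start, seg_end = row[start], row[end]
        # Emit one piece per day until start and end share a calendar day
        while seg_start.normalize() < seg_end.normalize():
            next_midnight = seg_start.normalize() + one_day
            pieces.append({**row.to_dict(),
                           start: seg_start,
                           end: next_midnight - one_second})
            seg_start = next_midnight
        # Remaining same-day piece (skipped if the record ended at midnight)
        if seg_start < seg_end:
            pieces.append({**row.to_dict(), start: seg_start, end: seg_end})
    return pd.DataFrame(pieces, columns=df.columns)


# Made-up sample: one overnight record and one same-day record
sample = pd.DataFrame({
    "startDate": pd.to_datetime(["2021-12-01 22:30:00", "2021-12-02 13:00:00"]),
    "endDate": pd.to_datetime(["2021-12-02 06:45:00", "2021-12-02 13:20:00"]),
    "value": ["HKCategoryValueSleepAnalysisAsleep"] * 2,
})

split = split_at_midnight(sample)
print(split[["startDate", "endDate"]])
```

Row-by-row iteration is slower than the vectorised version above, but sleep records are few enough that this is unlikely to matter.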
Finally, we can make our plot by using matplotlib. Some points are worth mentioning:
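As one possible way to draw such a plot (a minimal sketch on made-up data, not necessarily how the notebook does it), each already-split sleep interval can be drawn as a vertical bar: the date goes on the x-axis, the bar starts at the hour the interval begins, and its height is the duration in hours. The helper hours_since_midnight and the demo frame are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up, already-split sleep intervals (none crosses midnight)
demo = pd.DataFrame({
    "startDate": pd.to_datetime(["2021-12-01 23:10:00",
                                 "2021-12-02 00:00:00",
                                 "2021-12-02 22:45:00"]),
    "endDate": pd.to_datetime(["2021-12-01 23:59:59",
                               "2021-12-02 07:05:00",
                               "2021-12-02 23:59:59"]),
})


def hours_since_midnight(ts):
    """Convert a datetime Series to fractional hours of the day."""
    return ts.dt.hour + ts.dt.minute / 60 + ts.dt.second / 3600


start_h = hours_since_midnight(demo["startDate"])
duration_h = hours_since_midnight(demo["endDate"]) - start_h
dates = demo["startDate"].dt.strftime("%Y-%m-%d")

fig, ax = plt.subplots(figsize=(6, 4))
# One bar per interval: bottom is the start hour, height is the duration
bars = ax.bar(dates, duration_h, bottom=start_h, width=0.6)
ax.set_ylim(24, 0)  # midnight at the top, time running downwards
ax.set_yticks(range(0, 25, 3))
ax.set_xlabel("Date")
ax.set_ylabel("Hour of day")
ax.set_title("Sleep schedule")
```

In a notebook, the figure shows up at the end of the cell; in a script, call plt.show() or fig.savefig(...) to see it.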
Again, the complete IPython Notebook for making the plot is available in my GitHub repo: https://github.com/c0rychu/apple-sleep
Hope you’ll enjoy~